1-intro.md (5812B)
1 7 august 2021 2 3 # Intro 4 5 I decided that I want to write a chess engine so here you go. 6 7 First of all, this isn't my first time writing a chess engine. I stopped developing my previous engine because I didn't like how much code I had copied. 8 9 So what is a chess engine? At its core (I think) a chess engine which takes a chess position as input and outputs the best move (or at least what the engine thinks is the best move). In order to implement this, we need some sort of protocol to communicate to the engine which position to look at. 10 11 ## UCI Protocol 12 13 UCI (Universal Chess Interface) is the most commonly used chess protocol. The engine sends stuff to stdout, then a gui will read the stuff and process it. The stuff is specified in UCI. 14 15 There are a few basic commands in UCI that can be implemented without programming anything chess related. I'll leave out the exact specifics of how you do UCI until it is implemented. These commands are 16 17 `uci` 18 : Meant to be used to tell the engine to use uci instead of something else. 19 : Give the various options that may be set to change how the engine behaves 20 : The engine is meant to identify itself in response 21 22 `isready` 23 : This command is meant to "synchronise the engine with the gui" 24 : All you have to do is print "isready" 25 26 `quit` 27 : Just quit the engine 28 29 I'm only going to cover these commands for the rest of this article, so no chess just yet. 30 31 ``` 32 inline bool 33 cmd(const char *b, const char c) { 34 return strncmp(b, c, strlen(c)) == 0; 35 } 36 37 int 38 main() { 39 char buf[2048]; 40 41 while (1) { 42 fgets(buf, 2048, stdin); 43 44 if (cmd(buf, "uci")) { 45 puts("id name bpaul-chess v0.0.1"); 46 puts("id author Benjamin Paul"); 47 puts("uciok"); 48 } else if (cmd(buf, "isready")) { 49 puts("readyok"); 50 } else if (cmd(buf, "quit")) { 51 break; 52 } 53 } 54 55 return 0; 56 } 57 ``` 58 59 A uci command has the format "command arg1 arg2 ... argn\n". The \n is important as it represents the end of a command. 60 61 The buffer size of 2048 is to handle the size of other commands in the protocol, although the number 2048 is pretty arbitrary. I only chose it because it is a power of 2. 62 63 strncmp was used because maybe (as a human and not as a gui) I may enter a command such as "uci ". strcmp would say that this is not the same as "uci", but strncmp with n=3 would say that they are the same. 64 65 There may be room to optimize out the strlen in the strncmp. 66 67 WARNING this is completely useless. 68 69 The command c is always constant, and running strlen on a constant will always give the same results, so feeding in the constant would improve speed over using strlen. This can be implemented using the C preprocessor instead of just writing the constant value of strlen for each command. 70 71 ``` 72 #define CMD(s) strncmp(buf, s, sizeof(s)-1) == 0 73 . 74 . 75 . 76 if (CMD("uci")) { 77 ... 78 } else if (CMD("isready")) { 79 ... 80 } else if (CMD("quit")) { 81 ... 82 } 83 ``` 84 85 sizeof gives the length of the string including the null byte at the end, but strlen ignores the null byte, so 1 is subtracted to get rid of the null byte from the count. 86 87 These two versions of the program can now be compared. I am using the `perf(1)` command which is part of linux I think. These two versions are compiled with just -O3. 88 89 I ran the command `perf stat -e instructions:u ./prog < commands` on both versions and read the result. The commands file contains: 90 ``` 91 uci 92 isready 93 quit 94 ``` 95 essentially testing all of the commands so far implemented. 96 97 The first version had about 138505 instructions and the second version had about 138500. The difference was only 5 instructions??? As it turns out there is a gcc flag `-foptimize-strlen` which is enabled by -O2 or -O3, but it is interesting that using the macro still saved a bit of time. 98 99 After some more testing, it turns out that there is actually no improvement. For some reason, changing the binary file's name caused the program to use 5 more instructions, and it had nothing to do with the code. 100 101 By the way these optimizations are completely useless. 102 103 The next article will probably start going into the making of a board representation. Thanks for reading! Have a good day/night. 104 105 Update: 106 OMG I saved like 4000 instructions somehow by rewriting strncmp!! 107 108 Old strncmp: 138505 109 New strncmp: 134630 110 111 I couldn't figure out why the second version was better but I found it trying to optimize for SIMD. Lets hope the improvement did actually come from SIMD and not something else. 112 113 New strncmp code 114 ``` 115 bool 116 new_strncmp(const char *s1, const char *s2, unsigned long n) { 117 int i; 118 char c[4]; 119 120 if (n >= 4) { 121 unsigned long n4 = n >> 2; 122 do { 123 for (i = 0; i < 4; i++) { 124 c[i] = s1[i] - s2[i]; 125 if (s1[i] == '\0' || c[i] != 0) { 126 return c[i]; 127 } 128 } 129 130 s1+=4; 131 s2+=4; 132 n4--; 133 } while (n4 > 0); 134 n &= 3; 135 } 136 137 while (n > 0) { 138 c[0] = *s1 - *s2; 139 if (*s1 == '\0' || c[0] != 0) { 140 return c[0]; 141 } 142 s1++; 143 s2++; 144 } 145 146 return c[0]; 147 } 148 ``` 149 150 Update 2: 151 Ok so I have found out that using instructions is bad because some instructions are faster than others. This means that something else has to be used like cycles. 152 153 Using cycles shows that the new algorithm saves about 10000 cpu cycles which is pretty good (according to roughly looking at manual tests, so it isn't very formal)! Yay! 154 155 Update 3: 156 It turns out that i forgot to decrement n in the last while loop of the new algorithm, so the program wouldn't work if n was not a multiple of 4. Fixing this shows that the algorithm is the same as the one implemented in glibc, so this was all a waste of time. Yay! 157 158 ## Links 159 160 UCI protocol specification 161 http://wbec-ridderkerk.nl/html/UCIProtocol.html 162 strncmp in glibc source 163 https://sourceware.org/git/?p=glibc.git;a=blob;f=string/strncmp.c;hb=HEAD