Posts filed under #ileme

An extract on #ileme

In principle, it would be possible to inflate the number of bytes in an encoding by padding the code point with leading 0s. To encode the Euro sign (U+20AC) in four bytes instead of three, it could be padded with leading 0s until it was 21 bits long (000 000010 000010 101100) and encoded as 11110000 10000010 10000010 10101100 (F0 82 82 AC in hexadecimal). This is called an overlong encoding. The standard specifies that the correct encoding of a code point uses only the minimum number of bytes required to hold the significant bits of the code point, so overlong encodings are not valid UTF-8 representations of the code point. This rule maintains a one-to-one correspondence between code points and their valid encodings: each code point has exactly one valid encoding, which keeps string comparisons and searches well-defined. Modified UTF-8 makes one deliberate exception: it uses the two-byte overlong encoding of U+0000 (the NUL character), 11000000 10000000 (hexadecimal C0 80), instead of 00000000 (hexadecimal 00). Because an encoded string then never contains the byte 00, that byte remains available as a string terminator.
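The padding described above can be reproduced with a minimal Python sketch. The helper names `encode_utf8_padded` and `is_valid_utf8` are hypothetical and introduced only for illustration; the sketch relies on Python's built-in strict `utf-8` codec, which rejects overlong sequences.

```python
def encode_utf8_padded(code_point: int, n_bytes: int) -> bytes:
    """Pack code_point into exactly n_bytes (2 to 4) UTF-8-style bytes.

    Hypothetical helper: it pads with leading zero bits, so it can
    produce overlong (invalid) sequences on purpose.
    """
    payload_bits = {2: 11, 3: 16, 4: 21}  # data bits available per length
    lead_prefix = {2: 0xC0, 3: 0xE0, 4: 0xF0}  # 110xxxxx, 1110xxxx, 11110xxx
    if n_bytes not in payload_bits:
        raise ValueError("n_bytes must be 2, 3 or 4")
    if not 0 <= code_point < (1 << payload_bits[n_bytes]):
        raise ValueError("code point does not fit in that many bytes")

    out = []
    # Emit continuation bytes (10xxxxxx) from the low end, 6 bits at a time.
    for _ in range(n_bytes - 1):
        out.append(0x80 | (code_point & 0x3F))
        code_point >>= 6
    # Whatever bits remain go into the lead byte.
    out.append(lead_prefix[n_bytes] | code_point)
    return bytes(reversed(out))


def is_valid_utf8(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts data."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


euro = 0x20AC
shortest = encode_utf8_padded(euro, 3)   # minimal, valid form
overlong = encode_utf8_padded(euro, 4)   # padded, overlong form
modified_nul = encode_utf8_padded(0, 2)  # Modified UTF-8 form of NUL

print(shortest.hex(" "), is_valid_utf8(shortest))          # e2 82 ac True
print(overlong.hex(" "), is_valid_utf8(overlong))          # f0 82 82 ac False
print(modified_nul.hex(" "), is_valid_utf8(modified_nul))  # c0 80 False
```

Both byte sequences for the Euro sign carry the same payload bits, but only the three-byte form is accepted by a strict decoder, which is the uniqueness property the rule is meant to guarantee.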

WTF-8 (Wobbly Transformation Format 8-bit) is an extension of UTF-8 in which the encodings of the surrogate halves (U+D800 through U+DFFF) are allowed. This is necessary to store possibly-invalid UTF-16, such as Windows filenames. Many systems that deal with UTF-8 work this way without considering it a different encoding, as it is simpler. The name WTF-8 has also been used to refer to erroneously doubly-encoded UTF-8.
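A rough way to see the difference is Python's `surrogatepass` error handler, which lets the UTF-8 codec encode and decode surrogate code points. This is only a sketch of the behaviour for a lone surrogate, not a WTF-8 implementation: `surrogatepass` would also encode an adjacent high/low surrogate pair as two three-byte sequences, which is assumed here to fall outside well-formed WTF-8. The variable names are illustrative.

```python
lone_surrogate = "\ud83d"  # an unpaired high surrogate, as in ill-formed UTF-16

# Strict UTF-8 refuses to encode a surrogate code point.
try:
    lone_surrogate.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# With "surrogatepass" the surrogate gets the ordinary three-byte pattern.
relaxed = lone_surrogate.encode("utf-8", "surrogatepass")

# The same bytes are rejected by a strict decoder but round-trip
# when the relaxed handler is used for decoding as well.
try:
    relaxed.decode("utf-8")
    strict_decodes = True
except UnicodeDecodeError:
    strict_decodes = False

round_trip = relaxed.decode("utf-8", "surrogatepass")

print(strict_ok)                     # False
print(relaxed.hex(" "))              # ed a0 bd
print(strict_decodes)                # False
print(round_trip == lone_surrogate)  # True
```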

British North America (present-day Canada), where slavery was prohibited, was a popular destination, as its long border gave many points of access. Most former slaves settled in Ontario. More than 30,000 people were said to have escaped there via the network during its 20-year peak period, although U.S. Census figures account for only 6,000. Numerous fugitives' stories are documented in the 1872 book The Underground Railroad Records by William Still, an abolitionist who then headed the Philadelphia Vigilance Committee.

Members of the Underground Railroad often used specific terms, based on the metaphor of the railway. For example, the Big Dipper (whose "bowl" points to the North Star) was known as the drinkin' gourd. The Railroad was often known as the "freedom train" or "Gospel train", which headed towards "Heaven" or "the Promised Land", i.e., Canada.

William Still, sometimes called "The Father of the Underground Railroad", helped hundreds of slaves to escape (as many as 60 a month), sometimes hiding them in his Philadelphia home. He kept careful records, including short biographies of the people, that contained frequent railway metaphors. He maintained correspondence with many of them, often acting as a middleman in communications between escaped slaves and those left behind. He later published these accounts in the book The Underground Railroad: Authentic Narratives and First-Hand Accounts (1872), a valuable resource for historians to understand how the system worked and learn about individual ingenuity in escapes.

According to Still, messages were often encoded so that they could be understood only by those active in the railroad. For example, the following message, "I have sent via at two o'clock four large hams and two small hams", indicated that four adults and two children were sent by train from Harrisburg to Philadelphia. The additional word via indicated that the "passengers" were not sent on the usual train, but rather via Reading, Pennsylvania. In this case, the authorities were tricked into going to the regular train station in an attempt to intercept the runaways, while Still met them at the correct station and guided them to safety. They eventually escaped either to the North or to Canada, where slavery had been abolished during the 1830s.