One of the last big features for version 1 of WeSay has been in for a while. Someone (I won’t mention any names) did a great job on it but didn’t blog about it, so I’ll see if I can do it justice.
In this screenshot we see the three ways you can now specify sorting:
Sort like another language
If the text sorts just like some major language, just select that language in the list and you’re done.
Custom simple rules
Many languages based on Latin characters introduce a small number of "special characters" used to represent sounds not covered by A-Z, like a barred i. In these situations, you can specify the rules just like you do in many existing apps, like Toolbox and Lexique Pro. When you choose "custom simple", the rules box is filled with the rules needed to sort English. You can enter vernacular words in the "Test Sort" area:
We want the barred-i to sort just after i, so we add it to the rules and click the button:
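For readers who like to see the idea in code, here is a minimal Python sketch of how rules of this kind could drive a sort. It is not WeSay's or Palaso's actual parser. It assumes that each line of the rules is one letter, that the forms on a line (here lower and upper case) differ only at a secondary level, and that the barred i (ɨ) gets its own line right after i. The names `build_weights` and `sort_key` are made up for the illustration.

```python
import string


def build_weights(rules: str) -> dict:
    """Map each character to (primary, secondary) weights.

    Assumption: one line per letter; items on a line are separated by
    spaces and differ only at the secondary level.
    """
    weights = {}
    for primary, line in enumerate(rules.splitlines()):
        for secondary, ch in enumerate(line.split()):
            weights[ch] = (primary, secondary)
    return weights


def sort_key(word: str, weights: dict):
    """Compare all primary weights first, then use secondary as a tie-breaker."""
    # Characters missing from the rules sort after everything that is listed.
    last = (len(weights), 0)
    pairs = [weights.get(ch, last) for ch in word]
    return ([p for p, _ in pairs], [s for _, s in pairs])


# English-like starting rules: "a A", "b B", ... one letter per line.
english = [f"{c} {c.upper()}" for c in string.ascii_lowercase]

# Add the barred i on its own line, just after the line for i.
i_pos = english.index("i I")
rules = "\n".join(english[: i_pos + 1] + ["ɨ Ɨ"] + english[i_pos + 1 :])

weights = build_weights(rules)
words = ["jɨt", "ita", "ɨta", "iza", "Ita"]
print(sorted(words, key=lambda w: sort_key(w, weights)))
```

With these made-up words, the sketch puts ɨta after iza and before jɨt, whereas Python's plain `sorted(words)` compares code points and would push ɨta to the very end.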
Normally, these secondary distinctions are enough. But for some languages, tertiary distinctions are needed. We get these in the simple rules by using parentheses. Consider this list of words:
Now, imagine we want the upper-case words to sort together. We need to add in another level of distinction, so that case can trump the accents. We do this by adding parentheses around all case pairs, and putting the two sets of e’s on the same line:
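Here is a rough Python sketch of how three levels of distinction can work together in a sort key. It rests on one assumed reading of the simple rules, which may not match WeSay exactly: a new line is a primary difference, space-separated items on a line are secondary differences, and items inside one pair of parentheses differ only at the tertiary level. The function names and the sample words are hypothetical; the wiki write-up remains the authority on the real behavior.

```python
import re


def build_weights(rules: str) -> dict:
    """Map each character to (primary, secondary, tertiary) weights.

    Assumed reading: line = primary, item on a line = secondary,
    position inside parentheses = tertiary.
    """
    weights = {}
    for primary, line in enumerate(rules.splitlines()):
        # Each item is either a parenthesised group or a bare character.
        items = re.findall(r"\(([^)]*)\)|(\S+)", line)
        for secondary, (group, single) in enumerate(items):
            members = group.split() if group else [single]
            for tertiary, ch in enumerate(members):
                weights[ch] = (primary, secondary, tertiary)
    return weights


def sort_key(word: str, weights: dict):
    """Build a key that compares level by level across the whole word."""
    # Characters missing from the rules sort after everything that is listed.
    last = (len(weights), 0, 0)
    triples = [weights.get(ch, last) for ch in word]
    # zip(*triples) regroups the weights by level: all primaries first,
    # then all secondaries, then all tertiaries.
    return tuple(zip(*triples))


# Case pairs in parentheses, both sets of e's on one line.
# The accented letters are single precomposed code points here.
rules = "\n".join(["(d D)", "(e E) (é É)", "(f F)"])
weights = build_weights(rules)

words = ["Éd", "ed", "Ed", "éd", "ef", "Ée"]
result = sorted(words, key=lambda w: sort_key(w, weights))
print(result)
```

The point of the key's shape is that a lower level is only consulted when every higher level ties across the whole word, which is what makes a tertiary difference weaker than a secondary one.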
Eric has written up the details on our wiki.
Custom ICU rules
For languages that need them, WeSay also supports ICU tailorings, which look like this:
& C < č <<< Č < ć <<< Ć – for Serbian (Latin) or Croatian
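If the notation is new to you: in ICU rule syntax, `&` names the existing character to start from, `<` introduces a primary difference, `<<` a secondary one and `<<<` a tertiary one. The small Python sketch below just takes the rule above apart and spells out what each step says. It is a toy that handles only this subset of the syntax, and `parse_tailoring` is a name I made up; actual sorting should be left to ICU itself.

```python
import re

# Meaning of the relation operators in ICU collation rules.
LEVELS = {"<": "primary", "<<": "secondary", "<<<": "tertiary"}


def parse_tailoring(rule: str):
    """Split a simple ICU-style rule into (anchor, [(level, text), ...]).

    Handles only one '&' reset followed by '<', '<<' or '<<<' relations.
    Real ICU syntax has a good deal more than this.
    """
    rule = rule.strip()
    if not rule.startswith("&"):
        raise ValueError("expected the rule to start with '&'")
    # Splitting on a captured group keeps the operators in the result.
    parts = re.split(r"\s*(<{1,3})\s*", rule[1:].strip())
    anchor, rest = parts[0], parts[1:]
    relations = [(LEVELS[op], text) for op, text in zip(rest[0::2], rest[1::2])]
    return anchor, relations


anchor, relations = parse_tailoring("& C < č <<< Č < ć <<< Ć")

previous = anchor
for level, text in relations:
    print(f"{text} sorts after {previous} with a {level} difference")
    previous = text
```

Read this way, the rule says that č and ć are separate letters following C, and that each upper-case form differs from its lower-case form only at the tertiary level.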
Like many features of WeSay, this simple-to-advanced collating actually lives in our "Palaso Library", which is of course open-source and can be included in other programs. Thus we foresee a day soon when the setup you do in one program (e.g. WeSay) will be trivially usable in other language-development tools.