Update #3: Loki is coming

It’s been quite some time since I wrote here, but we haven’t stopped working at all. In fact, I now have a lot of new ideas on how to improve input methods on elementary OS. The good news is that Loki will be here soon and it has some cool new features. The bad (but really not so bad) news is that the Loki beta is already out and Loki stable will be released any time now, so adding big new features is not really an option. This means most of the more interesting ideas won’t make it in; instead, we should start planning to get them into Loki+1.

After trying to get loading of .xkb files working on Gala and discovering that it was an unsuccessful approach (I still want to do this, if anyone knows how please contact me), I decided to get as much as I could working on Loki. The first thing was the Compose key. It had been broken for a while on Loki because this functionality was removed from gnome-settings-daemon and moved to Gnome Shell, so we needed to move it to elementary’s window manager, Gala.

After fixing this, I wanted to get modifier-only shortcuts for switching keyboard layouts working again (the problem is described in bug #1357895). The issue is that the normal API for handling keybindings is not really designed for shortcuts of this type, although they are the most widely used in almost all operating systems. In the bug report, Maxim Taranov suggested a workaround in comment #52, so I started updating the keyboard settings panel to use it. While I was doing so, I was surprised to see a merge proposal to Gala by Kirill Antonik that would let us get this working in a less hacky way while supporting many more keybindings, so I immediately updated my work to use it instead. At that point I thought we just needed to wait for this branch to get merged into Gala so we could update the keyboard plug with mine. Gala’s branch got merged, but we decided the keyboard plug still needed some work, so that landed a bit late and some people trying the beta experienced a minor breakage. Luckily this is all fine now, as the keyboard plug has been merged too.

So, what’s new in the keyboard plug? Basically, we finally got rid of the Options tab and replaced it with options that appear and disappear for the layouts that actually need them (I bet most people won’t even know which these are without reading the code, but that’s exactly the point), while still leaving what we think are the most useful options available to users. Maxim and I spent a lot of time discussing which options to provide, and I think we managed to get a very nice compromise between user friendliness and flexibility. Some of this required careful parsing of the xkb_config database, but I really like the end result.

So that’s all for Loki, but I am much more excited about what will come next. It seems this series of posts got some attention, and since then I’ve been able to learn about other languages. For instance, I’ve spent some time talking to Jung-Kyu about Korean support, and that’s what I want to work on next; I have some new ideas about adding Korean-specific options to the keyboard plug. We have also contacted a lot of translators to learn what their experience has been typing in their language on elementary, and we’ve received feedback about CJKV and several other languages. I would really like to speak directly to people with problematic layouts so I can better understand the problem and figure out how to solve it (this is mainly why Korean will be the first: Jung-Kyu is providing a lot of feedback as we go along), so if you have some ideas please don’t hesitate to contact me.

Update #2: The need for Wayland

We left off previously with me about to write code to load xkbcommon keymaps into the X server. That soon proved to be not a very good idea, because the keymap type in xkbcommon is an opaque structure. This means the authors want to be able to change its organization at will without breaking someone else’s code, so the only way to manipulate it is through their API. We could technically bypass this, but I think the added maintenance burden would be too much, and it would make it difficult to update to newer versions of the library. Another way to accomplish this would be a patch to the libxkbcommon tree, which sounds fine, but as I said before I don’t want to bother other people because this is sort of an experiment. It would also add code there that will be useless once we move to Wayland, and it would make people less willing to move away from X.

After putting this to the side, it occurred to me that I could use the .vapi file I already had for libxkbcommon to manually translate keypresses into their Unicode characters and then send them to the applications. I just needed to know where to “tap” the event stream between a keypress and the moment it is sent to an application. This seemed a reasonable idea because I would have to do it anyway to implement ibus (or at least it seemed like the obvious way to do so). It turns out this stream can’t be tapped because it doesn’t even go through Gala, and it is also not how ibus is implemented (I guess precisely because keypress events can’t be intercepted). So how does this actually work? I’ll try to explain briefly next.

The first warning sign that told me this wasn’t how things worked was how ibus is implemented in Gnome Shell. There is a comment in their code that explains it: events reach the X server, then Mutter intercepts all of them and lets Clutter handle them through the clutter_x11_handle_event() function, which sends them to the corresponding application (Clutter Actor). The application then receives an event that hasn’t been translated yet, notices an input method is enabled and sends it to the D-Bus daemon, which pushes the resulting translation into a gdk event stream that the shell is listening to (through Mutter). Mutter then assumes every event here comes from ibus, so it sends it back to Clutter so it reaches the corresponding Clutter Actor again.

At this point I decided to verify all this by myself, because if we are filtering all X events through Mutter and sending them to the corresponding Clutter Actor, why can’t we translate them in Mutter and then send them just once, without this strange round trip to the application? I found 3 points at which I could intercept this event stream to see what was happening.

I knew there was a way of intercepting Clutter events, but nothing guaranteed that the function clutter_x11_handle_event() handled X events by translating them into Clutter events. After some reading of Clutter’s code I verified this was actually true: X events are translated and pushed into Clutter’s event queue. So I thought adding a Clutter event filter would show all keypresses before they were delivered to the application; if this were true, I would just need to translate the keycode as I wanted and let the event reach its normal destination. After doing this I found out no CLUTTER_KEY_PRESS events were going through (except for the tab and alt keys while alt-tabbing). What was going on here? At some point we were losing events, but they were reaching the application because everything worked normally, so was there another event queue somewhere handling this?

Then I found another filter, on ClutterX11. This one received an X event and a Clutter event, so my guess was that this function was called when translating an X event into a Clutter event. If my function filled in the Clutter event appropriately and returned TRANSLATE, I would be doing the translation myself and the Clutter event would continue on instead of the X event. So I tried listening for X keypress events. Sadly, there were still no keypress events to be seen. I could only see some events of type GenericEvent, which seemed like another dead end. After some research I found out that listening for X key events was very naive: nowadays, to support unusual input devices like Wacom tablets, we use XInput2, which uses GenericEvents to send bigger events than the core X protocol supports. So I needed to decode these GenericEvents (there seemed to be a lot of them, just what I would expect if every key press and release was being sent) to know whether there were some XI_KeyPress events going through there. Doing this wasn’t a trivial task, because XInput2 does not have much documentation and there are no Vala bindings for it. After creating a .vapi file for XI2.h I still had some trouble, because I needed to cast a struct pointer to another struct pointer, which is not documented in the manual vapi file tutorial (it turns out to be a simple pointer cast in Vala). After all this I could finally check if my keypresses were going through here, and guess what… they weren’t. Only the tab and alt keys I had seen before were going through here.
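To make the difference between a core key event and an XInput2 key event a bit more concrete, here is a rough Python model of the check described above. It is not the Vala code used in Gala. The numeric constants are the ones defined in X.h and XI2.h, but the Event class is a simplified stand-in invented for illustration, and the XInput2 opcode is assumed to be passed in (on a real server it is assigned at runtime).

```python
from dataclasses import dataclass
from typing import Optional

# Event type numbers from X.h and XI2.h.
KEY_PRESS = 2        # core protocol KeyPress
GENERIC_EVENT = 35   # container used by extensions such as XInput2
XI_KEY_PRESS = 2     # XI_KeyPress, carried inside a GenericEvent
XI_KEY_RELEASE = 3   # XI_KeyRelease, carried inside a GenericEvent


@dataclass
class Event:
    """Simplified stand-in for an XEvent (hypothetical, illustration only)."""
    type: int
    extension: Optional[int] = None  # extension opcode, GenericEvent only
    evtype: Optional[int] = None     # extension-specific event type


def classify(event: Event, xi_opcode: int) -> str:
    """Say which kind of key event this is, or 'other'.

    xi_opcode is the opcode the server assigned to XInput2; here it is
    just a parameter because this model never talks to a real server.
    """
    if event.type == KEY_PRESS:
        return "core-key-press"
    if event.type == GENERIC_EVENT and event.extension == xi_opcode:
        if event.evtype == XI_KEY_PRESS:
            return "xi2-key-press"
        if event.evtype == XI_KEY_RELEASE:
            return "xi2-key-release"
    return "other"
```

A filter that only looks for the core KeyPress type will miss everything in the second branch, which matches what I saw when I first listened for plain X key events.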

The third point at which I could intercept events was by overriding the method Meta.Plugin.xevent_filter(). I could see exactly where it was being called from Mutter, and I am pretty confident all X events that reach Mutter go through this before anything else happens; in fact, handling the event here would keep it from ever going through Clutter at all. Doing things here had a problem though: I would have to find the correct Clutter Actor that should get the event by myself. Nevertheless, if keypresses were going through here it would be something I could work with. So I copied my code from the ClutterX11 filter here, and if things weren’t disappointing enough already… well, no interesting keypress events were to be seen here either. So again, what’s going on here?

My conclusion from all this is that key press events never reach Mutter, and consequently never reach Gala either. This means the comment in Gnome Shell’s code is quite misleading, or maybe I misunderstood it from the beginning. Events reach the application (X client) directly; then, if an input method is enabled, they are sent to the window manager to be displayed in the input method window (the bubble). Keycode translation does not happen in the window manager but in the application itself, mostly hidden inside Gtk so that application developers don’t have to do this explicitly.

It seems to me that i18n is a hard problem that no one can claim to have solved completely without knowing every language that a Unicode string may represent. This makes people unwilling to commit to an API that will be frozen forever. Instead, what I feel is that projects just pass the ball around to other projects without anyone trying to actually solve it directly. Wayland right now only has basic keycode translation with libxkbcommon, just like X did with xkb, and leaves everything else to the applications, but then Gtk tries to make it easy to create internationalized applications by hiding all this below some API. Let’s hope wayland-im will improve things.

Where does this leave us regarding Gala? Well, I’ve come to the conclusion that what I wanted to do (load arbitrary xkb files) can’t be done in a non-hacky way, because currently, as an X window manager, we don’t have full control of the events going through. That will only be true once we move to Wayland (and even then we may have to deal with the remaining abstractions, Mutter and Clutter). Still, I think users can’t wait for Wayland right now, so I’ve decided to move away from my original idea and fix this temporarily for the next elementary release just the way Gnome does it. I will then start looking into moving Gala to Wayland, because I think this is what we really need to take the X server out of the middle and let us handle events by ourselves. I also think we need better infrastructure that allows developers to test things quickly before actually committing and releasing to everyone else, without having to deal with code that lives outside in abstract dependencies, so we’ll see how that turns out later.

 

Update #1: Too many layers

I’ve been digging, trying to implement the loading of xkb files on Gala. It turns out that things have been abstracted several layers deep, which makes it difficult to experiment without bothering people upstream and having to make changes and coordinate with several different projects (and teams). Because what I’ve been trying to do is rather experimental, I don’t want to just create patches for a lot of projects and argue they should merge my code because I think it will be good (I don’t even know that myself). Instead, what I’ve been trying to do is implement things and see how well they work out and whether I really think they’re useful for others. I’m also looking to add the least amount of code to Gala, so rewriting the whole keyboard handler seems like the most extreme solution, and I am trying not to come to this.

I will try to explain briefly all the layers involved with keyboard handling on Gala. In place of this I previously had about 3 paragraphs trying to do so, but they were mostly just ranting about the excessive abstractedness of the whole thing. In the end I don’t think that’s relevant to the discussion, and I don’t have a global understanding of all the projects to question decisions made by others, so I will just point out the important details I’ve seen that I need to get things done. For this I will just show a diagram of how the systems interact (it’s important to note that these relationships only concern keyboard handling; I have no idea how these systems interact for graphics, for example).

[Figure: Dependencies of keyboard handling]

There are 5 projects here; arrows represent where some project calls a library from someone else. Gala is written in Vala, which means calling C code from here is not a trivial task. Mutter and Clutter are written in C but use GObject and GLib. This makes them easy to call from Vala code, and because they are in the end just C code, they can easily call C libraries like Xlib or libxkbcommon. This makes them our most accessible interfaces to lower-level interfaces from Gala. As a side note, Mutter aims to be compatible with both Wayland and X11, so it has two backends to support this, but it also means we can’t expect Wayland-specific functionality to be provided by it.

I think Gala uses Mutter’s X11 backend, but I don’t know how to test this to be sure. The problem with this is that the API provided by Mutter to set the keyboard layout only uses the RMLVO description (rules, model, layout, variant, options). This seems to be a legacy interface that comes from the fact that this is what setxkbmap does and was the easiest to copy, as opposed to what xkbcomp does, which would have implied understanding the xkb description specification and how to upload it to the X server (both approaches in the end invoke the xkb compiler every time a layout switch happens, which is one of the issues I’m trying to solve). Because Mutter cares about providing functionality that is available on both X11 and Wayland, it’s unlikely that an API that uses libxkbcommon to load xkb files will be provided.
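For readers who haven’t met an RMLVO description before, this is a minimal Python sketch of what one looks like and how it maps to a setxkbmap command line. The RMLVO class, its default values and the function name are made up for illustration; only the setxkbmap flags themselves are real. The function builds the argument list and never runs anything.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RMLVO:
    """Rules, model, layout, variant, options: the names setxkbmap takes."""
    rules: str = "evdev"    # assumed default, for illustration
    model: str = "pc105"    # assumed default, for illustration
    layout: str = "us"
    variant: str = ""
    options: str = ""       # comma-separated, e.g. "compose:ralt"


def setxkbmap_args(desc: RMLVO) -> List[str]:
    """Build (but do not run) the equivalent setxkbmap command line."""
    args = ["setxkbmap", "-rules", desc.rules, "-model", desc.model,
            "-layout", desc.layout]
    if desc.variant:
        args += ["-variant", desc.variant]
    if desc.options:
        args += ["-option", desc.options]
    return args
```

The point of the sketch is how little an RMLVO description says: five short names that the xkb rules then expand into a full keymap, as opposed to a complete .xkb file that already spells out every key.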

Although Mutter with its native backend does use libxkbcommon to change keyboard layouts, when trying to call the function that does this from Gala I stumbled upon several issues. The most important one seems to be that X11 grabs input devices and does not let Clutter listen to events; an assertion in the library fails saying “Clutter is not the device manager”. Also, this interface is not part of the API stability guarantees, which on one hand could bring problems in the future, but even more frustratingly makes it completely unusable from Vala code. As it turns out, emitting #define symbols through Vala is impossible (we just need to add the line #define CLUTTER_ENABLE_COMPOSITOR_API), and even worse, it has to come before the #include directive for the library. So at this point using Clutter’s keyboard handling on the native backend becomes unfeasible.

So, what do we do now? Well, I’ve come to the conclusion that right now the simplest approach is to implement loading of libxkbcommon’s keymap format into X11 directly in Gala. For my earlier tests I created a .vapi file by hand for libxkbcommon, so I can now use it from Gala. I haven’t seen bindings for XKBlib, so I may have to do this next; we’ll see how that all goes. If I’m successful with this then maybe it would help Mutter to include an API for this, so I could upstream some of it if it would actually work for someone else.

As a minor side rant, I have to say that I wish Gala had a more monolithic design and did not impose OOP through Vala in the way it does. I mean, the most complex, biggest and by far most successful free software project is the Linux kernel, and it’s also a huge monolithic piece of code. I think splitting functionality across several projects adds a lot of overhead for people trying to help. It’s true that the codebase would grow substantially, but I think having code that actually does something, as opposed to glue code, would make it easier to fix problems and try out new things.

Keyboard input methods and i18n on elementary OS

This post contains some ideas I’ve had lately on how to improve the internationalization of input methods on elementary OS (mainly keyboard input). Originally, my idea was to write a blueprint but I think there’s still a lot to be discussed before proposing what to actually do. I will write down my ideas here and hopefully with some feedback we will manage to get a blueprint we can work on.

This is a very long text but it is what I currently need to clearly explain my reasoning. The idea is to summarize everything later, but you can always skip to the end where the most important information is summarized in bullet points.

The problem

There are a lot of different types of keyboard devices, and it’s very hard to guess which one the user has. However, this layer has been simplified greatly by the fact that almost all modern keyboards use USB, and that the kernel handles the translation of key presses to keycodes and provides them to us via evdev. Even if we can decode keyboards relatively easily, there is still the problem of which language the user wants to type in. This we cannot guess from anywhere except maybe the user’s locale configuration, and even if we tried, people who type in a language other than English will most likely want to type in several languages, with very specific needs (the applications they use have shortcuts better suited to US layouts, their language can be typed in several ways, or they are fluent in more than 2 languages). Currently this is solved by allowing users to switch keyboard layouts with a sequence of keys, but this has broken constantly because there has never been a definitive solution that encompasses enough languages to suit a very large audience. Languages are patched on top, and then any change breaks something.

In the times when X was developed, keyboard input was very basic and offered almost no features other than the ones needed for a US layout. This is why the X keyboard extension (xkb) was devised: it allowed switching the symbols assigned to every key on the fly, added more modifier keys, and added support for Unicode (I think this wasn’t there before but I’m not sure). This was a good thing, but it still left out several languages that are quite complex, such as Japanese or Chinese. Later on, ibus and several other input methods were created to support these, but in order to make layout switching work correctly distributions had to disable several xkb options, because xkb modified the keyboard layout at a lower layer than the input methods and would then produce some weird behaviors.

A lot of the design of xkb was influenced by the limitations imposed by the X protocol, and how fast (or slow) computers were back then. But now people are trying to leave X behind, so we can try to get rid of most of this cruft and design a more robust solution that won’t break as often. We want something that supports typing in as many languages as we can, allows users to switch seamlessly between them, and makes it as easy as possible for them.

Current state of things

Some time ago most distributions exposed all xkb options through a very ugly interface, just like elementary currently does. Ubuntu and Gnome were among them, but they decided to remove these options in favor of better support for complex languages that use ibus, compromising a lot of the xkb options (which sometimes can even conflict with each other). This angered users greatly because they couldn’t easily change their layouts as they did before. This was solved, but a lot of the flexibility that the xkb options allowed was still lost and feelings were hurt. I think there is no need for such a compromise.

Currently the workflow for anyone trying to type in another language is:

  1. Try to set up the language from the operating system’s settings panel. If they find it there, they are fine and happy.
  2. If this does not work google “how to type in <language name> on Linux/elementaryOS/Ubuntu” and get to some tutorial about configuring ibus or spend hours deciding which of all the options available they should try for their language.
  3. Install ibus (or another input method), and in the case of ibus also install the actual engine for the language they want to type in.
  4. Use the interface provided by the input method (which will always try to override what the operating system does because it knows better), then hope the operating system can handle this and wait for it to magically work, which sometimes doesn’t happen.

After doing this, even if they succeed at step 1, there are some caveats, for example a lot of people got used to changing layouts by using both shift keys. They will be disappointed to see this does not work anymore, but the only reason it worked before was because X was the sole manager of the keyboard and this is not the case anymore.

Another issue is that the panels provided by these input methods look ugly in elementary, which is not nice aesthetically.

The solution

To me, input methods can be classified into 2 types; let’s call them basic and advanced. Basic input methods map 1 input thing to exactly 1 keysym, where “input thing” stands for either one key press, several modifier key presses plus another non-modifier key, or a dead key followed by several other keys. The point here is that the computer can know by itself when to translate the input into the needed keysym, either because the keymap file tells it or because a dead key sequence matched. These can be easily specified and configured with xkb and its keymap file format for describing layouts.
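As a small illustration of what I mean by a basic input method, here is a Python sketch with a tiny keymap and a single dead key. The tables and the class are invented for the example (a real keymap has many more keys and levels, and this is not how xkb stores them), but the keysym names are real ones. Note that the machine always knows on its own when a keysym is ready.

```python
from typing import Dict, Optional, Tuple

# (key name, shift held?) -> keysym name. A toy table for illustration.
KEYMAP: Dict[Tuple[str, bool], str] = {
    ("a", False): "a",
    ("a", True): "A",
    ("e", False): "e",
    ("'", False): "dead_acute",
}

# Dead key sequences: (dead keysym, following keysym) -> resulting keysym.
COMPOSE: Dict[Tuple[str, str], str] = {
    ("dead_acute", "a"): "aacute",
    ("dead_acute", "e"): "eacute",
}


class BasicInputMethod:
    def __init__(self) -> None:
        self.pending_dead: Optional[str] = None

    def press(self, key: str, shift: bool = False) -> Optional[str]:
        """Return a keysym, or None while a dead key sequence is open."""
        keysym = KEYMAP.get((key, shift))
        if keysym is None:
            return None  # key not in the toy keymap
        if self.pending_dead is not None:
            dead, self.pending_dead = self.pending_dead, None
            # Unknown sequences fall back to the plain keysym.
            return COMPOSE.get((dead, keysym), keysym)
        if keysym.startswith("dead_"):
            self.pending_dead = keysym
            return None
        return keysym
```

Pressing the apostrophe key returns nothing yet, and the following “e” returns eacute; no question ever has to be put to the user.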

These files have often been kept hidden from users, and xkb options were provided as an “easier” way of editing them to the user’s needs. I think these options have evolved to fit a lot of particular requirements that not many people actually need, like “Left Alt as Ctrl, Left Ctrl as Win, Left Win as Alt”; nevertheless, distributions often just dump all of them into some GUI for the user. This has grown to a point where I think it may be even simpler to explain the format of a keymap file than to describe what these options do, don’t do, or how they interact when conflicting ones are enabled, like “Swap Ctrl and Caps Lock” and “Swap ESC and Caps Lock”. I think the definition of a keymap file is not difficult to understand once you remove a lot of the complexity added back then that we don’t need anymore, like groups, geometry and rules. On the contrary, the flexibility gained by learning this can’t be matched by any GUI or set of options provided by someone else. We should just expose this to power users and stop trying to digest it for them.
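To show why the two options mentioned above clash, here is a rough Python model. The option names ctrl:swapcaps and caps:swapescape are the real xkb names for those two options, but representing each option as a plain “physical key to keysym” table is a simplification of mine, not how xkb actually applies them.

```python
from typing import Dict, List, Tuple

# Each option is modeled as {physical key: keysym it should produce}.
# The option names are real xkb options; the model is a simplification.
OPTIONS: Dict[str, Dict[str, str]] = {
    "ctrl:swapcaps": {"CAPS": "Control_L", "LCTL": "Caps_Lock"},
    "caps:swapescape": {"CAPS": "Escape", "ESC": "Caps_Lock"},
}


def find_conflicts(enabled: List[str]) -> List[Tuple[str, str, str]]:
    """Return (key, first option, second option) for clashing remaps."""
    owner: Dict[str, str] = {}
    conflicts = []
    for name in enabled:
        for key, keysym in OPTIONS[name].items():
            if key in owner and OPTIONS[owner[key]][key] != keysym:
                # Two enabled options want different things from one key.
                conflicts.append((key, owner[key], name))
            else:
                owner.setdefault(key, name)
    return conflicts
```

With both options enabled the function reports that they fight over the Caps Lock key. A GUI that shows the two as independent checkboxes hides exactly this, while a keymap file makes it obvious because the key can only be defined once.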

Contrary to basic input methods, advanced input methods are required when the input character sequence yields multiple possible keysyms. This happens in languages where you type the sound of a sentence but it can be written in several ways, so the user must choose which one they want. In some cases the program even has to guess how to separate characters into words (note that I don’t know any of these languages, so this is what I have concluded from reading about them). This would also be the case if we wanted to provide some kind of predictive text input method like the one on phones nowadays. The difference here is that the input method can’t know what the user wants to translate their key presses into, or when the translation should happen, so it needs to provide an interface to them (usually a popup with options) and then wait for them to choose the correct one.

However, these two do not need different implementations. In fact, advanced input methods need a basic input method to get the characters they want to translate into keysyms, so they actually sit on top of basic ones and should work with them instead of trying to override everything they do.
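The layering can be sketched in a few lines of Python. This is a toy model under my own assumptions: the candidate dictionary is a tiny hand-made sample, and the class and method names are hypothetical, not the API of ibus or any real engine. The advanced method only ever sees characters that a basic method has already produced.

```python
from typing import Dict, List

# Tiny hypothetical dictionary: typed sound -> possible written forms.
CANDIDATES: Dict[str, List[str]] = {
    "ma": ["妈", "马", "吗"],
    "hao": ["好", "号"],
}


class AdvancedInputMethod:
    """Buffers characters from a basic method and offers candidates."""

    def __init__(self) -> None:
        self.preedit = ""

    def feed(self, char: str) -> List[str]:
        """Add one character from the basic layer; return current options."""
        self.preedit += char
        return CANDIDATES.get(self.preedit, [])

    def choose(self, index: int) -> str:
        """Commit the option the user picked and clear the buffer."""
        text = CANDIDATES[self.preedit][index]
        self.preedit = ""
        return text
```

Unlike the basic case, nothing is committed until choose() is called, which is the step that needs the popup and the user’s decision.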

On top of all this we should provide nice graphical interfaces so users can configure input methods to their liking which basically means provide ways of changing the basic input method and specific options to the advanced method they are using (if they are using one). Advanced methods should be “bundled” with a basic one that the user will be able to change if for example they want to use Pinyin on an AZERTY keyboard instead of QWERTY. The keyboard layout configuration would consist of a set of basic methods and advanced methods each with their own basic method bundled to them.
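One possible shape for that configuration, written as a Python sketch. Every name here is hypothetical (there is no such schema in Gala or the keyboard plug today); it only shows the idea of a list that mixes basic methods with advanced methods that each carry their own, replaceable, basic method.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class BasicMethod:
    layout: str                      # e.g. "us" or "fr"


@dataclass
class AdvancedMethod:
    engine: str                      # hypothetical engine name
    basic: BasicMethod               # bundled basic method, user-changeable
    options: Dict[str, str] = field(default_factory=dict)


Method = Union[BasicMethod, AdvancedMethod]


def describe(methods: List[Method]) -> List[str]:
    """Produce one human-readable label per configured method."""
    labels = []
    for method in methods:
        if isinstance(method, AdvancedMethod):
            labels.append(f"{method.engine} on {method.basic.layout}")
        else:
            labels.append(method.layout)
    return labels
```

Switching Pinyin from a QWERTY to an AZERTY keyboard is then just replacing the bundled basic method of that one entry, without touching the other layouts in the list.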

Implementation

Probably the most important factor here is that X is being replaced by Wayland. This will render all X specific stuff useless but will also give us the opportunity to choose where to go next. Either way we must be aware that some stuff will be useless in some time (code wise) unless we start to move to the actual libraries that will be used on Wayland.

To provide the kind of integration a user would expect from elementary, I believe we will need to add some new functionality to Gala. For the basic level of input methods libxkbcommon should be more than enough: this library contains all the layouts currently used on Linux but in an X-free environment, and is the proposed solution for when we move to Wayland. It loads a keymap from several sources, such as a description like the one used by setxkbmap with the RMLVO syntax (which is the only one currently used by Mutter), or a full .xkb file containing everything we need for a layout. We should provide a gsettings interface that allows choosing a layout using any of these options.

We also need to finally get rid of the options tab of the keyboard plug in switchboard because, as stated before, it is hard to understand, provides conflicting options, breaks often because X is not the only one controlling the keyboard anymore, and is not flexible enough for the user’s specific needs. Instead I have thought about a solution that takes into account the most common complaints I’ve found from people losing this tab: “I can’t swap X and Y keys anymore”, “I can’t enable the compose key where I want it” and “I can’t change layouts anymore”. The last one will hopefully be handled by Gala and the fact that layouts will mostly be in one place. So my idea is the following:

Add a small interface that asks the user to press a key and provides a menu of common actions to bind it to. These actions can be “Control, Caps Lock, Shift, Alt, AltGr, Compose Key, Menu” and maybe some Japanese-specific keys like the kana key, but I don’t know enough about this to have a concrete idea. The implementation of this is surprisingly easy if we use libxkbcommon to load a layout file where we create an alias for that key, or change its keycode.

[Figure: An attempt at a mockup of the widget I’m talking about.]
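To give an idea of what such a widget could produce, here is a Python sketch that turns a (key, action) choice into a small xkb symbols section. The keysym names are real ones, but the function, the section name and the choice of overriding the key’s symbol (rather than creating an alias or changing a keycode) are assumptions made for the example. Making a key behave as a real modifier may also need a modifier_map entry, which this sketch leaves out.

```python
from typing import Dict

# Menu entries from the mockup mapped to real keysym names.
ACTIONS: Dict[str, str] = {
    "Control": "Control_L",
    "Caps Lock": "Caps_Lock",
    "Shift": "Shift_L",
    "Alt": "Alt_L",
    "AltGr": "ISO_Level3_Shift",
    "Compose Key": "Multi_key",
    "Menu": "Menu",
}


def remap_section(name: str, key: str, action: str) -> str:
    """Return a partial xkb_symbols section binding one key to one action.

    key is an xkb key name without the angle brackets, e.g. "CAPS".
    """
    keysym = ACTIONS[action]
    return (
        f'partial xkb_symbols "{name}" {{\n'
        f"    replace key <{key}> {{ [ {keysym} ] }};\n"
        f"}};\n"
    )
```

Calling remap_section("custom", "CAPS", "Compose Key") yields a three-line section that replaces the symbol on the Caps Lock key with Multi_key, which could then be included in the layout file that gets loaded.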

For any other use case not handled by this, what I suggest is that we allow an arbitrary xkb file to be loaded from the switchboard plug. I really didn’t know how powerful this was, but I think most of the complaints people have could be addressed if they knew how to write their own keyboard layout file, and sometimes this can even be better than switching layouts. Even if people are really not willing to learn the file format, instead of trying to agree on a set of configuration options it would be easier to design a graphical application that generates layout files in an intuitive way, without most of the limitations that the original X input method imposed on the xkb file format, similar to what Ukelele does on OS X.

We could also fix some bugs as we move along. Currently, every time a layout switch happens libxkbcommon is called by Mutter to compile the next layout and load it into Clutter. We could be more clever about this and load all layouts into memory and just switch the reference to the current one. This would fix an annoying bug where you switch layouts and the first key you type after that doesn’t register because the new keymap file hasn’t been compiled yet.
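The idea of compiling everything up front can be sketched like this in Python. The class is hypothetical and the compile step is passed in as a plain function, standing in for whatever actually compiles a keymap (in the real thing that would be libxkbcommon, called from Mutter or Gala).

```python
from typing import Callable, Dict, List


class LayoutCache:
    """Compile every configured layout once, then switch by reference."""

    def __init__(self, names: List[str],
                 compile_keymap: Callable[[str], object]) -> None:
        # All the expensive work happens here, before any switch.
        self.keymaps: Dict[str, object] = {
            name: compile_keymap(name) for name in names
        }
        self.current = self.keymaps[names[0]]

    def switch(self, name: str) -> object:
        """Point at an already compiled keymap; no compilation happens."""
        self.current = self.keymaps[name]
        return self.current
```

The trade-off is memory for latency: every configured layout stays compiled in memory, but a switch becomes a dictionary lookup, so there is no window in which a key press can arrive before the new keymap is ready.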

For the advanced input methods people have always supported ibus, but I have found out this is not unanimous and some people prefer others like Fcitx. What ibus did was provide a framework that desktop environments used to display the bubble with options, interpret the user’s feedback and send it to the application; it did not provide any language-specific capabilities. Instead, other people used this framework to create engines for specific languages, which are daemons that communicate with the Shell through D-Bus. On Wayland, a merge was accepted into Weston that extends the Wayland protocol to allow a preedit section and feedback from the user. This is still a Weston-only thing and I haven’t come across information about it being merged into the core protocol, but this new framework will eventually replace what ibus did with a much more standardized version of it. So, on the X side of things we are pretty much left with using ibus, but it will be useless once we move to Wayland if im-wayland gets into the core protocol, so we may just support ibus as a temporary solution.

In any case, what I would like to do here (and I haven’t thought about this at length implementation-wise, mostly because I don’t know enough about these input methods to have an idea of the requirements) is to provide a set of advanced input methods out of the box which can be selected from the keyboard plug without installing anything, each providing layout-specific options on the right panel. To do this we would need to narrow down the problem to a subset of languages, and then choose one of the input methods available for each of them. Then we would need to be able to configure them from our keyboard plug (I think ibus engines provide a D-Bus interface we could use for this). This would imply looking at each language and deciding which options are the most useful, which is a very language-specific thing and would require someone who is fluent in the language to provide us with feedback.

Undoubtedly, for advanced input methods to happen nicely we need a lot of feedback from people who actually need them.

What needs to be done

If all that was too long to read for you, here is a summary of the key points I think need to be done:

Gala

  • Use libxkbcommon to add a way of loading arbitrary xkb files (this change should actually happen on Mutter).
  • Try to load all layouts into memory and just switch the reference to the current one.
  • Provide a gsettings interface that allows specifying the list of layouts, coming either from a file or from an RMLVO description, and also leave room to choose advanced input methods.

Switchboard keyboard plug

  • Kill the options tab.
  • Add a way of loading arbitrary xkb files into Gala.
  • Add an interface that asks for a keypress and shows a menu with options for what action should be bound to it, similar to how custom shortcuts are added.
  • See if ibus packages are installed and list them on the keyboard plug. Even better, a set of preinstalled ibus engines can be provided so that people choose a language and it just works.
  • Provide a subset of the options given on the specific engine’s configuration panel directly on the keyboard plug (this should be doable through d-bus).

Team work

  • Agree on a set of officially supported languages to narrow down the problem.
  • Get at least one person who uses each of these languages on a daily basis and is willing to spend time giving feedback to developers.
  • Decide on an input method per language from all the ones available.
  • Work with each one of the language advisors to agree on a subset of configuration options that will be provided directly from the switchboard plug.

Final Notes

All of this has come to my mind after a lot of time spent reading, mostly by myself, about this problem. I’m not entirely sure about everything here, but I would really like people to do some mockups for the keyboard plug. I have some on paper that I could try to draw in Inkscape, but my skills aren’t great.

Also my native language is Spanish so my assumptions regarding advanced input methods on other languages may be wrong.

Finally, I would really like to get general feedback about this. If you have any comments please feel free to send them to me; my mail is: santileortiz@gmail.com