I tried to eat our own dogfood this AM. Below is an account of what I found.
But before I go on, let me congratulate L-P on an excellent job. Although it may look below like I'm bitching and complaining, I'm not. I am astounded by the amount of work that L-P has done in such a short time.
But there are many things that could be improved and I talk about them below.
What I tried to do was create a translation of my page Alain Désilets on the site, into French and Spanish.
Issue #1: Problem when translating pages that do not have a language assigned to them.
The page Alain Désilets did not have a language assigned to it at the start.
I clicked on the Translate button and saw a picklist of languages. I selected French, thinking that this would specify the language of the translation.
Unfortunately, this was a case where the source page did not have a language associated with it, and this picklist was for specifying the language of the source page. So I ended up with a source page in French.
Now, as it turns out, there was a message above the language picklist that said:
No language is assigned to this page. Please select a language before performing translation.
But it was in a microscopic font. And since I was in translation mode, as soon as I saw the language picklist, I assumed it was for specifying the target language. To counter that assumption (which I think many users will make), we need to display a much more prominent WARNING-style message.
Another thing that could be done to minimize occurrences of these kinds of cases would be to do a better job at guessing the language of a newly created page.
For example, say I just created the present page using a dangling link on page Cross Lingual Wiki Engine Project, which is in English. In a case like that, I think it's safe to assume that the page is in English.
Response from LPH: This issue should now be solved. The message was made bigger in the first place to make it more visible as a warning. Second, the pages now get a default language when being created. If created from a page link in a wiki page, it takes the language from the linking page. Otherwise, it takes the site language (based on user preferences).
Comment from AD: Good! Regarding automatic language assignment, I think it's OK to assume that if, say, a page is created from a dangling link on an English page, then the created page should be assigned the English language. But if the page is created in another way, then I'm not so sure that using the site's default language is the way to go. It might be better to leave it at unknown in those cases. I think the consequences of leaving the language at unknown are not as dire as the consequences of wrongly guessing the language. If the language is unknown, the worst thing that can happen is that the user will have to specify the language at translation time. If we guess wrong (say, wrongly assign English), the worst thing that can happen is that the user will eventually try to create a translation in English, and the system will tell him that there is already an English version for that page. When the user goes there, he will see that the "English" version is in fact, say, Italian. This will be very confusing to the user, and he may not even know how to fix the damage.
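To make the two policies discussed above concrete, here is a minimal sketch in Python. It is not Tiki's actual code: the function name, the `fall_back_to_site_language` switch and the use of `None` for "unknown" are all assumptions made for illustration.

```python
from typing import Optional

UNKNOWN = None  # hypothetical marker for "no language assigned"


def default_page_language(parent_language: Optional[str],
                          site_language: str,
                          fall_back_to_site_language: bool) -> Optional[str]:
    """Pick a language for a newly created page.

    parent_language is the language of the page holding the dangling link,
    or None when the page was created some other way (an orphan page).
    """
    if parent_language is not None:
        # Created from a dangling link: inherit the linking page's language.
        return parent_language
    if fall_back_to_site_language:
        # Policy described in LPH's response: use the site language.
        return site_language
    # Policy suggested in AD's comment: leave it unknown rather than guess.
    return UNKNOWN
```

With `fall_back_to_site_language=False`, an orphan page stays at unknown and the user is asked for the language at translation time, which is the less risky of the two outcomes described above.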
Status report: I (AD) tested this on 2008-02-19, and indeed, language of a page created through a dangling link is automatically determined based on the language of the parent page.
Also, it seems that when you create an orphan page (i.e. not through a dangling link), it will automatically choose the site's default language (English in this case). Not sure that this is necessarily a good idea, but we can leave it like that for now.
Given this, there is little point in worrying too much about the warning message, because the only case where this message would be displayed is if someone explicitly manually overrides the language from English to Unknown (not likely to happen). But it's good that we have that message, because it could still accidentally happen, or it could be that we later revert our design decision regarding defaulting the page language to the site default. Note that I find the warning message to still be a bit too small.
Issue #2: Need more prominent out of dateness status? Maybe not.
OK, so I changed the language of the original page to English, then proceeded to translate it to French and Spanish.
Then I changed the French page and added original content to it.
Then I went back to the English page, and expected to see some sort of warning that the French page was more up to date.
At first, I didn't see anything related to translations.
Then L-P pointed out to me that there should be a Translation module in the upper left corner.
This tells me that maybe the information about translations and out of dateness should be somehow more prominent, or maybe closer to the language picklist at the top right. But maybe this would be crowding the UI too much.
Status report: AD looked at it again on 2008-02-19. The Page translation box at the top right now seems to occupy a bit more space and call attention to it more. But maybe it's just that I now know where to find it. Maybe we could add some sort of internationalisation logo to bring the eye to it more. For example, the i18n logo on this page: http://www.wiki-translation.com/tiki-admin.php, or something like this: http://www.recitlangues.org/projets/documents/esl/flags/PICTURES/FLAGS.JPG
Issue #3: Minor tweak to language preferences dialog
Looking at the Page translation box, I noticed that it said that there was no page with more up to date content.
I asked L-P why, and he said it's because I hadn't told either Mozilla or the Tiki site that I can read other languages.
So I went to my User Preferences page, in the Languages section.
I picked French as a language I can read. This added 'fr' to the list of languages. Then I clicked on the Change Preferences button, and the picklist of languages turned blank.
This was confusing to me. I thought there was a bug and it had failed to register my preference. Eventually, I understood that the text field on the left is where the preferences are stored, and the picklist on the right is just a way to choose a language to add.
We need to make that clearer. For example, the first item of the picklist, instead of being blank, should say something like "Pick a language to add". The text box should have a caption that says "Languages you can read". There should be a button that says "Add" beside the picklist.
Status report: AD looked at it again on 2008-02-19, and it was not fixed. Add a tracker item for it.
Issue #4: Page Translation box should provide info relevant for BOTH readers AND translators.
Once I changed my user preference to tell Tiki that I can also read French, I did see a box called Page translation which said:
"No translations with less up-to-date content match your preferred languages. More..."
But it seems to me it should say that the French version is more up to date, because this is what matters most to a reader.
L-P said it wasn't a bug. He wrote it that way.
Of course for a translator, it might also be important to know what languages are LESS up to date than the current version.
Here's what I propose.
There are *untranslated changes* for this page in your *preferred languages*.
There are translations in your *preferred languages* that *need updating*.
*Translations outside of your preferred languages*
Where *abc* means a hyperlink with abc as anchor text.
The *preferred languages* link would take you to the page for configuring your preferred languages.
The *untranslated changes* link would show you pages that are more up to date than the current one.
The *need updating* link would show you pages that need updating and that you could do something about.
Finally, the *Translations outside of your preferred languages* link would be a kind of catchall that would list all pages needing updates and all pages that have more recent changes, in all languages supported by the site. This might be useful in case the user has not set his language preferences.
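The grouping behind those three links could be sketched roughly as follows. This is only an illustration: the status labels ('newer', 'older', 'in_sync') and the function name are hypothetical, and I am assuming the system can already tell how each translation relates to the current page.

```python
from typing import Dict, List, Set


def classify_translations(status_by_language: Dict[str, str],
                          preferred: Set[str]) -> Dict[str, List[str]]:
    """Group the translations of the current page for the Page translation box.

    status_by_language maps a language code to a hypothetical status:
    'newer'   -> that translation has changes not yet in the current page
    'older'   -> that translation needs updating from the current page
    'in_sync' -> nothing to do
    """
    result: Dict[str, List[str]] = {
        "untranslated_changes": [],  # what matters most to a reader
        "need_updating": [],         # what matters most to a translator
        "outside_preferred": [],     # catchall for all other languages
    }
    for lang, status in sorted(status_by_language.items()):
        if status == "in_sync":
            continue
        if lang not in preferred:
            result["outside_preferred"].append(lang)
        elif status == "newer":
            result["untranslated_changes"].append(lang)
        elif status == "older":
            result["need_updating"].append(lang)
    return result
```

A user who has not set any language preferences would get everything in the catchall group, which is the case that last link is meant to cover.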
Status: AD looked at it on 2008-02-19, and the layout and wording of the Page translation box is not very clear. Nothing to do.
Issue #5: Bug in translation tracking infrastructure?
At some point I changed the French version, but for some reason, English did not show up as needing to be updated from French. I tried to reproduce this problem without success.
Status: AD on 2008-02-19, could not reproduce this bug. Just leave it for now.
Issue #6: Minor improvements to translation dialog
Eventually, I was able to update the English version from the French. It worked quite nicely actually! I found the French diff on top of the English text box worked quite well, and enabled me to quickly zero in on what needed to be done.
But here are some improvements.
The caption at the top that says:
Edit: Alain Désilets, es
Comparing version 1 with version 5
Should really say something like this:
Update "Alain Désilets, es" based on "Alain Désilets, fr"
Changes that need to be reproduced in Spanish are highlighted below.
Status: As of 2008-02-19, this is not done. Write a tracker item for it.
Also, the diff is not as good as it could be. For example, I changed the phrase "il a été président de la conférence" to "il a présidé la conférence".
This got highlighted as follows: "il a pr*a é*sid*t*é *président de* la conférence", which is hard to read. It would have been better if it had come out as: "il a *présidé* la conférence".
Note that I tried the different diff styles available in Tiki, and most of them seem to have this sort of problem, except for HTML Diff. Unfortunately, HTML Diff shows a side-by-side diff and we probably just want an inline diff. Oh well, it will have to do for now.
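For what it's worth, the more readable highlighting described above is what you get from diffing on whole words rather than on characters. Here is a minimal sketch using Python's standard difflib. It is not how Tiki's diff styles are implemented; the function name is hypothetical, and deleted words are simply dropped from the output to keep the example short.

```python
import difflib


def highlight_word_changes(old: str, new: str) -> str:
    """Return the new text with changed or added words wrapped in *...*."""
    old_words, new_words = old.split(), new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    out = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(new_words[j1:j2])
        elif tag in ("replace", "insert"):
            # Highlight the whole run of new words as one unit.
            out.append("*" + " ".join(new_words[j1:j2]) + "*")
        # tag == "delete": removed words are omitted in this sketch.
    return " ".join(out)


print(highlight_word_changes("il a été président de la conférence",
                             "il a présidé la conférence"))
# prints: il a *présidé* la conférence
```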
Status: As of 2008-02-19, this is still the case. Create a tracker item for this.
Comment #7: Should "consecutive" changes be merged into a single translation bit?
Next, I did two changes on the French, in two shots (i.e. two saves).
Then I went to update English based on French, and I was pleased to see that the diff showed me the two changes together.
I was thinking about this issue when reading L-P's translation tracking architecture. It sounds like with this new architecture, different saves would be considered to be different translation bits. So does this mean that the end user would have to translate them in two different transactions? Or would it be better to consolidate "consecutive" translation bits into one bigger translation bit? What does "consecutive" mean in this context?
On the one hand, consolidating consecutive translation bits is good in that it avoids situations where the user needs to do multiple updates. In particular, if the author writes a sentence, saves, then reformulates that sentence, the translator would not have to translate the original sentence and its re-formulation. He would just translate the final version. (Note: translators tend to work in a mode where they write a first draft, then revise it and refine after the fact, so this may be very common).
On the other hand, keeping consecutive translation bits separate has the advantage of reducing granularity of transactions. This is important because in L-P's new architecture, there is an assumption that when a user is updating a page based on another language, the update is considered completed as soon as the user hits the save button. But if the update is a large one, the user may want to click on save midway through translation. If translation bits are kept short, this is less likely to happen.
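The consolidation option can be sketched as a single diff between the last synchronized save and the latest save, skipping the intermediate ones. This is only an illustration under my own assumptions (saves held as a list of page texts, a hypothetical `last_synced` index, line-level granularity); it says nothing about how L-P's architecture actually stores translation bits.

```python
import difflib
from typing import List


def pending_changes(saves: List[str], last_synced: int) -> List[str]:
    """Lines added or changed between the last synced save and the latest one.

    Intermediate saves are ignored, so a sentence that was written and then
    reformulated shows up only once, in its final form.
    """
    old = saves[last_synced].splitlines()
    new = saves[-1].splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changed: List[str] = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            changed.extend(new[j1:j2])
    return changed


saves = [
    "Intro.",
    "Intro.\nFirst draft of a sentence.",
    "Intro.\nA reformulated sentence.",
]
print(pending_changes(saves, last_synced=0))
# prints: ['A reformulated sentence.']
```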
Status: 2008-02-19. It seems that consecutive content chunks are consolidated into a single translation transaction. This seems to work fine for now.
Issue #8: Copying and pasting from the diff inserts weird end of line characters
A common practice with translators is to start with the source text and write the translation over it.
When updating the English page based on the French, I found myself naturally copying highlighted content from the French diff, into the English edit text box, and then overwriting it.
But... when I do this, the following character: ↵ gets inserted at the end of each line.
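I don't know where in the diff display these markers come from, so the real fix probably belongs there. As a stopgap, though, the pasted text could be cleaned up along these lines (a sketch only; the function name is hypothetical):

```python
RETURN_MARKER = "\u21b5"  # the "↵" character that shows up in pasted text


def strip_return_markers(pasted: str) -> str:
    """Remove the end-of-line markers but keep the real line breaks."""
    return pasted.replace(RETURN_MARKER, "")
```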
Status: As of 2008-02-19, this is still the case. Create a tracker item for it.
Issue #9: Translations should not have to always be done in one go
I think this is a biggie.
If I update the English version based on the French version, and hit the save button before I have completed the translation, the system considers that English and French are now up to date, and there is no longer any way for me to remember that it's in fact out of date, and what parts need to be updated.
This is a problem I encountered when I developed multilingual features in LizzyWiki, and I was never able to find a satisfactory solution to it.
The solution I used was that the system kept nagging the user, asking him if the translation was complete or not.
I suggest the following:
Under the save button, there would be a check box saying
Translation completed
This check box would be unchecked by default.
When the user saves a page, there would be some sort of page alignment checking algorithm that would try to guess whether or not the translation is completed.
If the user checked that box, but the system figures that something is fishy, i.e. the translation does NOT look like it's done, it would display a message saying something like this:
The highlighted text below does not seem to have been translated. Are you sure you want to label this translation task as complete?
Conversely, if the user did not check the box, but the system thinks that the translation is done, then it would display something like this:
It looks like you have completed this translation task. Is that the case? Yes | No
I think it shouldn't be too hard to implement this kind of sanity check on translation completion. In fact there are people in my group who have implemented something like that. And it's something that can lead to a cool paper. I'll investigate.
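Here is a very rough sketch of how the check box and the system's guess could be reconciled. The "alignment" test is a deliberately naive stand-in of my own (it only looks for source segments left verbatim in the target, which is what happens when a translator pastes the source and has not yet overwritten it); a real page alignment algorithm would be much more sophisticated. All names are hypothetical.

```python
from typing import List, Optional


def untranslated_segments(changed_source_segments: List[str],
                          target_text: str) -> List[str]:
    """Source segments that still appear verbatim in the target page."""
    return [s for s in changed_source_segments
            if s.strip() and s.strip() in target_text]


def completion_prompt(box_checked: bool,
                      changed_source_segments: List[str],
                      target_text: str) -> Optional[str]:
    """Return a message to show on save, or None if nothing looks fishy."""
    looks_done = not untranslated_segments(changed_source_segments, target_text)
    if box_checked and not looks_done:
        return ("The highlighted text below does not seem to have been "
                "translated. Are you sure you want to label this translation "
                "task as complete?")
    if not box_checked and looks_done:
        return ("It looks like you have completed this translation task. "
                "Is that the case? Yes | No")
    return None  # the user's claim and the system's guess agree
```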
Status: 2008-02-19, Nothing done on that of course, cause it's a biggie. Just create a tracker item for it.
Issue #10: Changes that have already been synchronized keep being highlighted.
Something is terribly wrong with the translation tracking.
I brought the system to a state where it showed En, Fr and Es as all being in synch with each other.
Then I added the following change to the En version
----
This change will first be made in English, then translated to Spanish, then Translated to French. I will then look to see if the system considers that Spanish needs to be updated from French. 2008-02-05@15h34
Then I went to Fr version, and did Update from En.
This is what the system highlighted as changes that needed to be brought into Fr
---
This modification was first made in French, 2008-02-05-@14h12
This modification was first made in French, 2008-02-05@14h56.
This sentence is part of the same change.
This modification was first made in French, 2008-02-05@14h57.
This change will first be made in English, then translated to Spanish, then Translated to French. I will then look to see if the system considers that Spanish needs to be updated from French. 2008-02-05@15h34
Looks like the translation UI is going back too far in the Fr versions.
L-P says he fixed that bug on his local version, but it's not live yet.
Status: Fixed as of 2008-02-19.
Issue #11: Cannot mix translation and original editing in a same transaction
Say, I make a change to En. Then I update Fr based on En. Say it's a pretty long translation.
Since I'm a wiki-translator, I'm translating content that I care about and that I know a lot about. In fact, I am also a prolific author on that site.
While translating the change, I spot an error in a part of the text that has nothing to do with the translation I am currently doing. Chances are that I will go ahead and fix the error right there while I'm still in the middle of a translation transaction.
Then I finish the translation, and I click on Save.
Problem is, the system now considers that En and Fr are in synch, even though my correction (which I did at the same time as the translation) is not in the En version.
I think we need to figure out a way for people to switch between translator and author modes within the same transaction. This type of scenario has happened to me often. Because I knew about the issue, I was able to be disciplined about it and just write down the change on a piece of paper, and delay doing it until I was done translating. But even then, I sometimes forgot that I wasn't supposed to do that. So for sure, "ordinary" users will have trouble splitting their personality into "translator" and "author" roles.
Again here, a solution could come from some sort of translation verification feature. If the system sees a change that looks like it was not a translation of a change on the En side, it would signal that to the user, and maybe allow the user to label that change as being an original contribution as opposed to a translation.
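As a rough illustration of that kind of verification, the sketch below flags paragraphs that changed on the Fr side but have no corresponding change on the En side. It rests on a strong assumption that I am making only for the example: that the two pages are aligned paragraph by paragraph, so that the same index means the same content. The function names are hypothetical.

```python
import difflib
from typing import List, Set


def changed_paragraphs(old: List[str], new: List[str]) -> Set[int]:
    """Indices (in the new version) of paragraphs that were added or changed."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changed: Set[int] = set()
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            changed.update(range(j1, j2))
    return changed


def probable_original_edits(source_old: List[str], source_new: List[str],
                            target_old: List[str], target_new: List[str]) -> List[int]:
    """Target paragraphs that changed without a matching source change.

    Assumes source and target are aligned paragraph by paragraph.
    """
    expected = changed_paragraphs(source_old, source_new)
    actual = changed_paragraphs(target_old, target_new)
    return sorted(actual - expected)
```

The flagged paragraphs are the ones the system could show to the user with a question like "is this an original contribution?", so that they get propagated back to En instead of being silently treated as part of the translation.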
Status: AD created a mockup of a constrained translation GUI that might address this issue: Mockup of a constrained translation GUI. Need to create a tracker for it!!!
Issue #12: Translated changes appearing as needing translation
On 2008-02-12, I did an original change to Alain Désilets, English version, i.e. I added this line:
Added in English, 2008-02-12@15h00
Then, I did an original change to the French version Alain Désilets, fr, i.e. I added this line:
Ajouté en français, 2008-02-12@15h02
Note the difference in the time stamp.
The system rightly showed that both those pages needed to be updated from the other.
So I updated the English page based on the French, i.e. I added the following translation on the English side:
Added in English, 2008-02-12@15h02
Then, I updated the French page based on the English.
At that point, I expected the diff to only show the change that was made in English that needed to be reproduced in French, i.e.:
Added in English, 2008-02-12@15h00
But instead, it showed me all the changes that happened in the English page since the last time French and English were in synch. In other words, it showed me both the original change that was made in English, AND the English translation of the original French, i.e.:
Added in English, 2008-02-12@15h00
Added in English, 2008-02-12@15h02
Ideally, it should only have shown the first of those two lines. But this is hard to do technically.
The reason is that Tiki finds the changes to propagate from En to Fr by doing a diff of the En page, between its most recent version and the last version where En and Fr were flagged as being in synch.
If we don't want to show diffs that have been done in En in the course of propagating changes from Fr to En, we need to somehow:
- Identify the En versions that were created in the course of a Fr->En translation transaction. For each of those:
- Figure out the diff between that En version and the previous one.
- Remove those diffs from the diff between the current En version and the most recent En version that was in synch with Fr
This second step is the one that would be really tricky to do.
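The steps above can be sketched as follows, for the easy case only. This is not Tiki's tracking code: the `Version` record and its `translated_from` field are hypothetical, and the sketch works on whole added lines. It does not attempt the really tricky part, namely edits and deletions that overlap with translated text.

```python
import difflib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Version:
    lines: List[str]
    # Hypothetical flag: language this version was translated from,
    # or None when the save was an original edit.
    translated_from: Optional[str] = None


def added_lines(old: List[str], new: List[str]) -> List[str]:
    """Lines that appear in new as insertions or replacements."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    out: List[str] = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            out.extend(new[j1:j2])
    return out


def changes_to_propagate(history: List[Version], last_sync: int,
                         target_lang: str) -> List[str]:
    """Lines added since the last synch, minus those that came from target_lang."""
    pending: List[str] = []
    pairs = zip(history[last_sync:], history[last_sync + 1:])
    for previous, current in pairs:
        if current.translated_from == target_lang:
            continue  # created while propagating target_lang -> this page
        pending.extend(added_lines(previous.lines, current.lines))
    latest = history[-1].lines
    # Keep only lines that still exist in the most recent version.
    return [line for line in pending if line in latest]


history = [
    Version(["Base text"]),
    Version(["Base text", "Added in English, 2008-02-12@15h00"]),
    Version(["Base text", "Added in English, 2008-02-12@15h00",
             "Added in English, 2008-02-12@15h02"], translated_from="fr"),
]
print(changes_to_propagate(history, last_sync=0, target_lang="fr"))
# prints: ['Added in English, 2008-02-12@15h00']
```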
Status: Nothing done on this as of 2008-02-19, cause it's a tough nut to crack. But create a tracker item for it in the backlog nevertheless so we remember the idea.