User talk:Twarz

Everything, on every topic, without any kind of cleanliness or order. I'll just write here everything that happens / that I do and put it back cleanly somewhere else.

=Documentation=

USING VALGRIND
I'm not yet in the debugging / coding process, so valgrind's more advanced options for locating leaks, such as --leak-check and --show-reachable, are irrelevant for now. For the first tests, everything will be done using this simple command:

valgrind --tool=memcheck ./soffice.bin

This will allow me to see globally where the leaks and the memory errors are. Once I get Eric's approval, I shall build some libs with debugging symbols and try to fix everything that needs it (with debugging symbols, each unfreed malloc will be located precisely, at a given line in a given file). But for now, let's just start slowly.

 $>valgrind --tool=memcheck ./instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/ooo4kids0.4/program/soffice.bin
 ==7891== Memcheck, a memory error detector.
 ==7891== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
 ==7891== Using LibVEX rev 1884, a library for dynamic binary translation.
 ==7891== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
 ==7891== Using valgrind-3.4.1-Debian, a dynamic binary instrumentation framework.
 ==7891== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
 ==7891== For more details, rerun with: -v
 ==7891==

Valgrind will basically serve two main purposes: finding where the unfreed mallocs are, and finding the memory-related errors that COULD lead to a crash.

Let's try a little program to see how valgrind reacts in case of unfreed mallocs.

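 #include <stdlib.h>
 int main(int ac, char **av)
 {
     char *str = NULL;
     int   i = 0;
     /* each pass overwrites str, so the previous block becomes unreachable */
     while (i != 10) {
         str = malloc(10);
         i++;
     }
     free(str);  /* frees only the last of the 10 blocks */
     return (0);
 }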

Here is what valgrind tells us:

 ==10181== Memcheck, a memory error detector.
 ==10181== Copyright (C) 2002-2008, and GNU GPL'd, by Julian Seward et al.
 ==10181== Using LibVEX rev 1884, a library for dynamic binary translation.
 ==10181== Copyright (C) 2004-2008, and GNU GPL'd, by OpenWorks LLP.
 ==10181== Using valgrind-3.4.1-Debian, a dynamic binary instrumentation framework.
 ==10181== Copyright (C) 2000-2008, and GNU GPL'd, by Julian Seward et al.
 ==10181== For more details, rerun with: -v
 ==10181==
 ==10181==
 ==10181== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 8 from 1)
 ==10181== malloc/free: in use at exit: 90 bytes in 9 blocks.
 ==10181== malloc/free: 10 allocs, 1 frees, 100 bytes allocated.
 ==10181== For counts of detected errors, rerun with: -v
 ==10181== searching for pointers to 9 not-freed blocks.
 ==10181== checked 69,040 bytes.
 ==10181==
 ==10181== LEAK SUMMARY:
 ==10181==   definitely lost: 90 bytes in 9 blocks.
 ==10181==     possibly lost: 0 bytes in 0 blocks.
 ==10181==   still reachable: 0 bytes in 0 blocks.
 ==10181==        suppressed: 0 bytes in 0 blocks.
 ==10181== Rerun with --leak-check=full to see details of leaked memory.

So yeah, basically each malloc needs its own free: you can't just call free once on a pointer that has been malloc'd many times, because only the last block is still reachable through it (here, 9 of the 10 blocks are lost). In a little program like this one, that isn't really hard to find and correct. With something as big as OOo4Kids, it is going to be much harder. Let's take the case of the memory leaks I've detailed here: http://wiki.ooo4kids.org/index.php/KnowIssues.
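Coming back to the little program for a moment, here is a minimal sketch of what "one free per malloc" looks like. The function name alloc_and_free_each and its parameters are mine, made up for the illustration; they do not come from OOo4Kids.

```c
#include <stdlib.h>

/* Allocates `count` buffers of `size` bytes, releasing each one before the
 * next allocation. Returns the number of buffers that were freed. */
int alloc_and_free_each(int count, size_t size)
{
    int freed = 0;

    for (int i = 0; i < count; i++) {
        char *str = malloc(size);
        if (str == NULL)
            break;          /* allocation failed: stop early */
        /* ... use str here ... */
        free(str);          /* one free for this malloc */
        freed++;
    }
    return freed;
}
```

Calling alloc_and_free_each(10, 10) from main and running the result under valgrind (built without optimization, so the compiler keeps the allocations) should show as many frees as allocs for this loop.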

Using only --tool=memcheck truly isn't enough to localize the problems; we will have to use some other options, like --leak-check=yes and --show-reachable=yes. So let's try this.

valgrind --tool=memcheck --leak-check=yes --show-reachable=yes ./soffice.bin 2> output.txt

Why the 2> ? Because valgrind always writes its reports on STDERR, and those reports can sometimes be longer than what the terminal keeps in its scrollback. Having the report in a file allows me to read everything valgrind tried to tell me, even though the result might be quite large... In this case, the file's size is about 280 KB, which is starting to get quite big. But it does give us a lot of information. Too much to paste everything here, but I'll at least paste two examples.

 ==11859== 73,648 bytes in 101 blocks are still reachable in loss record 191 of 193
 ==11859==   at 0x4C254D0: memalign (vg_replace_malloc.c:460)
 ==11859==   by 0x4C2558A: posix_memalign (vg_replace_malloc.c:569)
 ==11859==   by 0x110A3480: (within /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0x110A4D08: g_slice_alloc (in /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0x11061E41: g_array_sized_new (in /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0x110AFDD0: g_static_private_set (in /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0x1106F352: g_get_filename_charsets (in /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0x1106F3BD: (within /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0x110B0054: g_thread_init_glib (in /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0xEB25E74: create_SalInstance (in /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libvclplug_gtklx.so)
 ==11859==   by 0x7F8A939: (within /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libvcllx.so)
 ==11859==   by 0x7F8BA77: (within /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libvcllx.so)

 ==11859== 140,597 bytes in 835 blocks are still reachable in loss record 193 of 193
 ==11859==   at 0x4C25684: calloc (vg_replace_malloc.c:397)
 ==11859==   by 0x1108E9B9: g_malloc0 (in /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0x110AED39: g_thread_self (in /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0x110AFFD7: g_thread_init_glib (in /usr/lib/libglib-2.0.so.0.2000.1)
 ==11859==   by 0xEB25E74: create_SalInstance (in /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libvclplug_gtklx.so)
 ==11859==   by 0x7F8A939: (within /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libvcllx.so)
 ==11859==   by 0x7F8BA77: (within /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libvcllx.so)
 ==11859==   by 0x7D2319F: InitVCL(com::sun::star::uno::Reference const&) (in /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libvcllx.so)
 ==11859==   by 0x7D23546: (within /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libvcllx.so)
 ==11859==   by 0x7D23744: SVMain (in /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libvcllx.so)
 ==11859==   by 0x527D8DB: soffice_main (in /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/OOo4Kids/basis0.4/program/libsofficeapp.so)
 ==11859==   by 0x40126A: main (in /home/twarz/OOo/OpenOfficeSrc/m54/DEV300_m54/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/fr/ooo4kids0.4/program/soffice.bin)

So, where is the interesting data here? Two things to spot:

-The first line indicates the severity of the leak: 73,648 bytes for the first record and 140,597 bytes for the second one. These are the total unfreed bytes for all the allocations made through one specific call path. The second one is therefore probably one of the causes of this leak, because the amount is quite large. As the leak is provoked by repeating the same action many times, it makes some sense that the same mallocs are responsible.

-The following lines give a hint about where exactly the leak is located. Starting from the bottom line and going up to the second line, we can trace the path through the binaries from the beginning of the program to the allocation that was never freed. Unfortunately, debugging symbols were not included in the build, so we still only have an imprecise idea of what is where. Moreover, many libraries seem to be involved, which means each of them will probably have to be rebuilt with symbols, which will take time and make a much bigger binary in the end.
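With almost 200 loss records in the file, reading the first line of each by hand gets tedious. Here is a small sketch that pulls the byte and block counts out of a loss-record header so the records can be sorted by size. It assumes the header wording printed by the valgrind 3.4.1 shown above (other versions may phrase it differently), and the function names are mine.

```c
#include <ctype.h>
#include <string.h>

/* Parses a number written with thousands separators, e.g. "73,648".
 * Advances *p past the digits and commas that were consumed. */
static long parse_grouped(const char **p)
{
    long value = 0;

    while (isdigit((unsigned char)**p) || **p == ',') {
        if (**p != ',')
            value = value * 10 + (**p - '0');
        (*p)++;
    }
    return value;
}

/* Extracts the byte and block counts from a loss-record header line.
 * Returns 1 on success, 0 if the line is not a header (a stack frame,
 * for instance). */
int parse_loss_record(const char *line, long *bytes, long *blocks)
{
    const char *p = strstr(line, "== ");    /* end of the "==PID==" prefix */

    if (p == NULL)
        return 0;
    p += 3;
    if (!isdigit((unsigned char)*p))
        return 0;                           /* frames start with spaces */
    *bytes = parse_grouped(&p);
    if (strncmp(p, " bytes in ", 10) != 0)
        return 0;
    p += 10;
    if (!isdigit((unsigned char)*p))
        return 0;
    *blocks = parse_grouped(&p);
    return strncmp(p, " blocks", 7) == 0;
}
```

Feeding it every line of output.txt and keeping the largest byte counts gives a short list of records worth reading first.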

Twarz 16:47, 6 August 2009 (UTC)

USING GDB
Today, August the 19th, was such a sunny and beautiful day that I decided to spend it working on gdb and gdb only, to try and fix this issue (http://wiki.services.openoffice.org/wiki/Education_Project/Effort/Various_adaptations_on_Sugar/Work_In_Progress#Impress:_empty_presentation_.2B_effect_.3D_boom). Let's go.

Like every real, big, bad crash, this one was quite easy to locate. The first step, of course, is to write a clear step-by-step guide to reproduce it (otherwise, just like me, you will spend most of your time trying to reproduce the bug with the debugger, then with the libraries built with symbols, and it takes loooots of time).

Launch Gdb
Okay, so let's start from scratch. You have OOo4Kids, clean and beautiful, no debugging symbols, nothing. Gdb can still be of use to locate heavy crashes. Let's start.

 twarz@Taokaka:~$ gdb
 GNU gdb 6.8-debian
 Copyright (C) 2008 Free Software Foundation, Inc.
 License GPLv3+: GNU GPL version 3 or later
 This is free software: you are free to change and redistribute it.
 There is NO WARRANTY, to the extent permitted by law. Type "show copying" and "show warranty" for details.
 This GDB was configured as "x86_64-linux-gnu".
 (gdb)

You just launched gdb, but in itself it won't help much. Open a new terminal and launch OOo4Kids. Then, using top or the system monitor for instance, check which process number is associated with OOo4Kids. Let's say: 2514.

(gdb) attach 2514

This will attach (oh, really?) gdb to the specified process. Here is what happens next (the log below comes from a real session, where the process number was 27651):

 Attaching to process 27651
 Reading symbols from /home/twarz/OOo/OpenOfficeSrc/m55/DEV300_m55/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/en-US/ooo4kids0.4/program/soffice.bin...(no debugging symbols found)...done.

The same lines appear for every library in the $(SRC_ROOT)/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/en-US/OOo4Kids/basis0.4/program/ directory. Gdb reads and loads every debugging symbol it finds in the libraries.

 0x00007f185c991742 in select from /lib/libc.so.6
 (gdb)

Type "continue" (or c for short).

Basic use of gdb's "attach processnumber"
(yes, there are other ways of using gdb, but that's not the question for now)

You have two things at your disposal:

-one terminal running gdb

-OOo4Kids (or any other program).

Both will help you debug your program correctly. Well, in fact, if you did not include any debugging symbols, you won't be able to do much, but let's suppose you have. The last thing you typed in gdb's command line is "continue". You now have full control over your program, and can do whatever you want with it. But at some point, you might want to see what the hell is going on right now. Go back to the terminal, and press CTRL+C (it sends the SIGINT signal to the process and gives control back to gdb). You can now check what is happening in the source code itself (see examples below), and control your program from the command line. The most common commands are:

-p / print variable: displays [variable]'s value on the terminal

-n / next: goes to the next line of the code, making the program advance at the same time

-u / until: runs until a line past the current one is reached, which lets you skip whiles, fors...

-s / step: goes to the next line, OR if the current line is a function call, continues inside this function

-b / break function: sets a breakpoint on a specific function. It means that when control is handed over to the program, it will automatically be sent back to gdb once the specified function has been reached by the execution of the program. It is pretty convenient: this way you don't have to trace from the very beginning if only one function of your code needs to be debugged (it also means that you approximately know where to debug).

-c / continue: you should know it already, gives control back to the original program.

Those tools should allow you to keep an eye, at any given time, on what's really going on inside your program, and on what could possibly go wrong or works just as planned. But how can you find where you need to debug / which libraries need debugging symbols? Let's see this in the next section, with an example of what I traced today.

One Example
Okay, so we currently have no debugging symbols, and we typed "continue" in gdb. Let's reproduce the crash, and analyse the result:

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 0x7f1863ff6750 (LWP 27651)]
 0x00007f18490b44c6 in ?? from /home/twarz/OOo/OpenOfficeSrc/m55/DEV300_m55/instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/en-US/ooo4kids0.4/program/../basis-link/program/slideshow.uno.so
 (gdb)

We don't get much information, and this is definitely not going to show us where the problem is... But we have a first hint: at the time of the crash, the process was executing code located in the slideshow.uno.so library. Which does not contain debugging symbols... yet. Adding them will give us much more precise information. Let's do it. Open a new terminal, go to the slideshow directory, and build with the symbols.

$> build debug=true

Once done, copy the new library to the instsetoo_native/unxlngx6.pro/OOo4Kids/installed/install/en-US/OOo4Kids/basis0.4/program/ directory. Start gdb and OOo4Kids again, and reproduce the crash. Here is the new result:

 Program received signal SIGSEGV, Segmentation fault.
 [Switching to Thread 0x7f161a960750 (LWP 16129)]
 0x00007f15f7fd4478 in slideshow::internal::EventMultiplexerImpl::forEachView const&)> (this=0x7f15f85aaae8, pViewMethod=&virtual table offset 104) at /home/twarz/OOo/OpenOfficeSrc/m55/DEV300_m55/slideshow/source/engine/eventmultiplexer.cxx:451
 451                ((*aIter)->getUnoView.get->*pViewMethod)( mxListener.get );

Now this is much more helpful. We get the file, we get the line... Almost everything needed to fix the crash! Gdb stopping here means that the process crashed while trying to execute this particular line. Now you just have to find the problem, modify the code, recompile and see if your guess was correct.

In this particular case, correcting it was pretty easy (one single check to add), but it might be much harder in other cases. I recommend you use the gdb command "bt" (backtrace) once the execution has stopped. It lists every function call the process went through before reaching its stopping point. You might want to look through these for the location of your problem...
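The actual patch is not reproduced in these notes, so what follows is only a generic illustration, in plain C, of the kind of check I mean: test a pointer before calling through it while looping over a list. Everything in it (struct view, on_event, notify_views, count_call) is hypothetical and unrelated to the real slideshow code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a view: it only carries a callback. */
struct view {
    void (*on_event)(void *listener);
};

/* Example callback: treats the listener as an int counter. */
void count_call(void *listener)
{
    (*(int *)listener)++;
}

/* Calls on_event on every usable view and returns how many were notified.
 * Entries that are NULL, or whose callback is NULL, are skipped instead of
 * being dereferenced, which is what would otherwise end in a SIGSEGV. */
size_t notify_views(struct view *const *views, size_t count, void *listener)
{
    size_t notified = 0;

    for (size_t i = 0; i < count; i++) {
        if (views[i] == NULL || views[i]->on_event == NULL)
            continue;                   /* the single check */
        views[i]->on_event(listener);
        notified++;
    }
    return notified;
}
```

Whether skipping the entry is the right reaction depends on the code: sometimes a NULL there reveals a bug further up, which is exactly what "bt" helps to find.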

Twarz 17:46, 19 August 2009 (UTC)

=Work on XOs=

Quick Introduction
After receiving the XOs, Kidd and I will be able to start the most interesting part of our training period: adapting OOo4Kids to Sugar, and getting it to run on the XO laptops. We will basically proceed as follows: experiment, locate, correct. This means we will spend a lot of time merely using OOo4Kids, trying to find bugs and / or features to optimize, then use the tools at our disposal (gdb, valgrind, callgrind...) to locate them precisely in the code, and then, with Eric Bachard's and Pierre Pasteau's help, do our best to correct them and propose a working patch that will be submitted for the next version of OOo4Kids and maybe even OOo. Our objectives are as follows:

 * submit at least 2 visible performance improvements (one per student, possibly more if time allows it);
 * submit at least 2 UI improvements (just as above, at least one per student);

Memory
XOs are laptops with a really low amount of RAM. Therefore, one of the major objectives will be to make sure that there are close to no memory leaks. If not, OOo4Kids might be unusable on the laptops. We will probably mainly use Valgrind to locate and correct the biggest leaks. Many tests will be carried out:
 * letting the program run for a long time, with a huge amount of information provided;
 * filling an Impress wizard with a lot of information, then cancelling (this test has proved positive on the current version of OOo4Kids and will require special attention);
 * deleting figures, text, drawings... during a session, and making sure the memory that was used is correctly freed;
 * checking previews and slideshows (Impress' wizard);
 * checking the size limit of the documents and trying to extend it by releasing unused memory;
 * simply checking a normal, basic usage of the different applications;
 * secondary: locating and eliminating possible invalid frees/deletes.

The idea is to see how much memory OOo4Kids is using at the very beginning, when displaying the Start Center, when the user hasn't done anything. Then, carry out the tests, cancel everything, and come back to the Start Center. If the amount of memory used by OOo4Kids is higher at that time, then there is probably a memory leak.

Here is how we will proceed to locate the memory leaks once they have been recognized as such (this is also a more precise way to see if there is indeed a memory leak). We will be using Valgrind.

First, launch OOo4Kids with Valgrind, reach the Start Center, and quit immediately. Here is the result:

 ==578== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 1418 from 3)
 ==578== malloc/free: in use at exit: 2,240,486 bytes in 10,319 blocks.
 ==578== malloc/free: 41,644 allocs, 31,326 frees, 10,481,760 bytes allocated.
 ==578== For counts of detected errors, rerun with: -v
 ==578== Use --track-origins=yes to see where uninitialised values come from
 ==578== searching for pointers to 10,319 not-freed blocks.
 ==578== checked 22,363,496 bytes.
 ==578==
 ==578== LEAK SUMMARY:
 ==578==   definitely lost: 13,796 bytes in 327 blocks.
 ==578==     possibly lost: 122,753 bytes in 89 blocks.
 ==578==   still reachable: 2,103,937 bytes in 9,903 blocks.
 ==578==        suppressed: 0 bytes in 0 blocks.

As we can see, there are already some unfreed allocations, but we will overlook them for now. We can now start carrying out the tests, using these values as the base value. The idea is that the difference between allocs and frees should always stay the same: this would mean that everything the action we just performed allocated was correctly freed, and that the remaining "leaks" are only part of the base value. If the difference is greater, the test is positive: not all the memory we just used has been correctly freed. It can be dangerous: if we use OOo4Kids for too long, the used memory will keep growing, and cause a crash. Let's try to merely open a Writer document, come back to the Start Center, and quit OOo4Kids. Here is the Valgrind log:

 ==32656== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 2299 from 4)
 ==32656== malloc/free: in use at exit: 2,374,725 bytes in 10,830 blocks.
 ==32656== malloc/free: 64,424 allocs, 53,595 frees, 23,395,607 bytes allocated.
 ==32656== For counts of detected errors, rerun with: -v
 ==32656== Use --track-origins=yes to see where uninitialised values come from
 ==32656== searching for pointers to 10,830 not-freed blocks.
 ==32656== checked 29,136,480 bytes.
 ==32656==
 ==32656== LEAK SUMMARY:
 ==32656==   definitely lost: 13,796 bytes in 327 blocks.
 ==32656==     possibly lost: 136,449 bytes in 97 blocks.
 ==32656==   still reachable: 2,224,480 bytes in 10,406 blocks.
 ==32656==        suppressed: 0 bytes in 0 blocks.

Base value: 41644-31326 = 10318

New value: 64424-53595 = 10829

There IS a memory leak: 511 more allocations were left unfreed. To check the severity of the leak, we just have to take a look at the "LEAK SUMMARY".

Base Value: 13 796 + 122 753 + 2 103 937 = 2 240 486 bytes.

New Value: 13 796 + 136 449 + 2 224 480 = 2 374 725 bytes.

Meaning, there is a leak of 134 239 bytes. It's not too much, so this leak is not an urgent matter, but it still represents a minor risk.
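Since this comparison will be repeated for every test, here is a minimal sketch of the arithmetic in C. The struct and function names are mine; the fields are simply the counters read from the end of a Valgrind log, as in the two logs above.

```c
/* Counters copied from the end of one Valgrind run. */
struct vg_summary {
    long allocs;
    long frees;
    long definitely_lost;   /* bytes */
    long possibly_lost;     /* bytes */
    long still_reachable;   /* bytes */
};

/* Number of allocations that were never freed. */
long unfreed_allocs(const struct vg_summary *s)
{
    return s->allocs - s->frees;
}

/* Total bytes listed in the leak summary. */
long unfreed_bytes(const struct vg_summary *s)
{
    return s->definitely_lost + s->possibly_lost + s->still_reachable;
}

/* Extra bytes left behind by a test run, compared to the base run.
 * A positive result means the test is "positive": something leaked. */
long leak_growth(const struct vg_summary *base, const struct vg_summary *run)
{
    return unfreed_bytes(run) - unfreed_bytes(base);
}
```

Filling one struct with the Start Center figures and one with the figures of the test being run gives the growth directly, which avoids slips when adding the numbers by hand.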

This is basically how to proceed to find a memory leak in the program. To locate it in the code, we will have to use some of Valgrind's options and add debugging symbols to the libraries involved in the leak.

CPU
Just as for the memory, the XO laptops have limited resources in terms of CPU. This will be our second important objective: globally optimize OOo4Kids and get rid of any situation that could lead to 100% CPU usage. As for the memory leaks, many tests will be made:
 * opening a high number of documents at once, filled with information;
 * letting the program run alone during different phases of its execution;
 * filling the dialogue boxes in the wizards (positive on the current version in the Impress wizard) and trying to fix the huge slowdown that might result from it;
 * giving special attention to the Impress application, which seems to be prone to crashing and is globally under-optimized.

Monitoring the CPU usage will be done with two tools on the XO: the shell command "top", which shows a lot of information about every running process (including, of course, the % of CPU the process uses), and gdb. How? Pretty simple. First, run OOo4Kids normally, and find some conditions that lead to 100% CPU usage. Then, using gdb, reproduce the same conditions and stop the program at the right time. This will give us a first indication of where the problem is inside the code. At this point, we will have to read the code, understand what it does, and try to find a way to make it do it faster.
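To log the CPU usage during a test instead of watching top, the same figure can be computed by hand. This is only a sketch, assuming the Linux /proc/<pid>/stat layout described in proc(5), where utime and stime are fields 14 and 15, counted in clock ticks; the function names are mine.

```c
#include <stdio.h>
#include <string.h>

/* Returns utime + stime (in clock ticks) from the content of
 * /proc/<pid>/stat, or -1 if the line cannot be parsed. */
long cpu_ticks_from_stat(const char *stat_line)
{
    /* The command name (field 2) may contain spaces, so start after
     * the last ')' that closes it. */
    const char *p = strrchr(stat_line, ')');
    unsigned long utime, stime;

    if (p == NULL)
        return -1;
    /* Skip the state (field 3) and fields 4 to 13, then read 14 and 15. */
    if (sscanf(p + 1, " %*c %*s %*s %*s %*s %*s %*s %*s %*s %*s %*s %lu %lu",
               &utime, &stime) != 2)
        return -1;
    return (long)(utime + stime);
}

/* CPU usage in percent between two samples taken `seconds` apart. */
double cpu_percent(long ticks_before, long ticks_after,
                   double seconds, long ticks_per_second)
{
    if (seconds <= 0.0 || ticks_per_second <= 0)
        return 0.0;
    return 100.0 * (double)(ticks_after - ticks_before)
                 / (seconds * (double)ticks_per_second);
}
```

The number of ticks per second can be obtained with sysconf(_SC_CLK_TCK). Reading the stat file twice, one second apart, and passing both totals to cpu_percent gives a per-process percentage that can be written to a log during a long test.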

Improving UI


As stated in the introduction, one of the goals of this training period is to have us develop new elements in the UI, or improve existing ones.



One of the most likely additions is a new wizard for the Writer application, which for now has none. Based on the Impress wizard, it will offer at most three pages for the user to personalize the document he is about to create. It will therefore be slightly simplified compared to the Impress wizard. It will be created using the same code, most of which is located in the file sd/source/ui/dlg/dlgass.cxx

Promoting the XO
Receiving the XO will also allow us to assist Eric Bachard during a conference that will take place in October. Many points will be presented during this conference: the Education project, in order to make OOo4Kids better known to the public, our own project of contributing to the source code, and of course a presentation of the XOs, their capabilities, their goal, and an overview of the OLPC foundation.

What is the XO?
XO laptops are an initiative of the OLPC (One Laptop per Child) association. They are low-cost, low-performance laptops (meant to be affordable for poor children) running Sugar (a Linux environment using neither KDE nor Gnome but a new type of interface). There are three types of XO: the XO-1, the XO-1.5 and the XO-2.

The XO-1 uses an AMD Geode LX-700@0.8W processor with a CPU frequency of 433 MHz, and an AMD CS5536 South Bridge chipset. It has 256 MB of RAM and 1 GB of flash memory. The XO-1.5 is an update of the XO-1: the processor is now a VIA C7-M with a CPU frequency of 1 GHz, and it has 4 GB of flash memory (and some other changes).

The XO-2 really changes the laptop's shape: there is no keyboard anymore but a double touch-screen that will allow the development of whole new activities.



What has been done - how to check our progress
For now, the work on the XOs hasn't begun because we lack the laptops, which should be coming soon. Everything related to our own project as students is available on the OOo wiki at this page. A blog has been put in place, and will be used mainly to share our progress on the XOs. Some documentation about the use of GDB and Valgrind is available here.

How we will proceed to improve OOo4Kids performance
In order to improve the performance of OOo4Kids, we have to follow some rules to make our research reusable and trustworthy in a scientific way.


 * First, we will have to establish some "base values": the amount of CPU or memory used by OOo4Kids in normal conditions (like when we launch the Start Center).


 * Then we will run a test, cancel everything, and look at the difference with these base values.


 * Then, thanks to this test, we will have a better idea of where an optimisation could be done. We will write down everything that is needed (like how to get the exact same conditions, and all the data to be compared) and we will try to optimize step by step, comparing each time with what was written down. We will go through it with some tests and some brainstorming.