Wednesday, November 19, 2008

Kernel upgrade to 2.6.27 for MIPS/ARM platform

I ran into some other interesting issues while upgrading the kernel from 2.6.23 to 2.6.27 for our MIPS32 platform.

1. The timer. In 2.6.24 and later, the default MIPS timer has been separated from the old timer code. Because our architecture uses the default cp0 compare-based timer, we need to enable the r4k timer to get the timer working. I spent a couple of days understanding this change. The start_kernel function was able to proceed once the proper timer was enabled. However, we also have a timer block in our chip, and we can set it to a certain frequency as an independent timer source. Later in my debugging, I enabled the timer block and used our own timer source. That works as well.
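
To make the compare-based timer a bit more concrete, here is a minimal user-space sketch of the arithmetic involved: the cp0 Count register is a free-running 32-bit counter, and the next tick is scheduled by writing Count plus a delta into Compare. The helper names (cycles_per_tick, next_compare, deadline_missed) are hypothetical and are not taken from the kernel source; the Count frequency is also an assumption, since it depends on the core (it is often half the CPU clock).

```c
#include <stdint.h>

/* Hypothetical helper: number of Count cycles in one kernel tick.
 * count_hz is the rate at which cp0 Count increments (an assumption;
 * check your core's manual), hz is the kernel tick rate. */
static uint32_t cycles_per_tick(uint32_t count_hz, uint32_t hz)
{
    return (count_hz + hz / 2) / hz;   /* round to nearest */
}

/* Value to program into cp0 Compare for the next event.
 * Unsigned arithmetic wraps modulo 2^32, just like the hardware counter. */
static uint32_t next_compare(uint32_t count, uint32_t delta)
{
    return count + delta;
}

/* Returns 1 if Count has already moved past Compare, i.e. the event
 * was programmed too late. The signed difference keeps the check
 * correct across a counter wrap-around. */
static int deadline_missed(uint32_t count_now, uint32_t compare)
{
    return (int32_t)(count_now - compare) > 0;
}
```

For example, with a Count rate of 100 MHz and a 100 Hz tick, cycles_per_tick returns 1000000, and next_compare(0xFFFFFFF0, 0x20) wraps around to 0x10.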

2. The cache. The cache should be disabled when the kernel starts and re-enabled later on. In 2.6.23, the cache was enabled by default. However, in 2.6.27 (or some version after 2.6.23), the cache policy became configurable through the "cca=" kernel option, and it was disabled by default. This change really hurt me. With so many changes between the 23 and 27 kernels, it was almost impossible for me to notice this one at first. What I observed first was that the system was very slow: the BogoMIPS dropped from about 273 to 3, which was unbelievable. At the beginning I doubted the correctness of the timer. I scrutinized the code and studied the new timer implementation thoroughly. I even implemented our own timer using the timer block in our chip. None of that helped. The system was able to boot to busybox, but it was really slow. Almost by accident, I tried our performance counter program to measure the performance. It reported 0 cache hits, which meant that no cache was enabled. I checked our private I/D cache registers and they seemed to be enabled. However, I forgot to check the MIPS cp0 register, which has another setting that controls the cache policy. I then used a crude, old-fashioned method to pinpoint the problem: I added a NOP test to both the 23 and 27 kernels. In the 23 kernel, the NOP test gives a much lower CPI (cycles per instruction) once the cache is enabled, and a high CPI otherwise. In the 27 kernel, the CPI never changed. I located the exact point where the CPI dropped in the 23 kernel and checked the corresponding code in the 27 kernel. I finally found that the default cache policy was disabled in the 27 kernel, while it was enabled in the 23 kernel. After adding "cca=3" to the kernel command line, everything was back to normal: the BogoMIPS recovered and the kernel booted properly.
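
The cp0 setting mentioned above can be illustrated with a small sketch. On MIPS32, the low three bits of the cp0 Config register (the K0 field) hold the cache coherency attribute for kseg0; the architecturally defined values are 2 (uncached) and 3 (cacheable), which matches the "cca=3" fix. The macro and function names below are hypothetical, the Config value in the usage note is made up for illustration, and on real hardware the register would be read with an mfc0 instruction rather than passed in as a plain integer.

```c
#include <stdint.h>

/* K0 field of the MIPS32 cp0 Config register: bits 2..0. */
#define CONFIG_K0_MASK            0x7u
#define CCA_UNCACHED              2u   /* kseg0 accesses bypass the cache */
#define CCA_CACHEABLE_NONCOHERENT 3u   /* the value selected by "cca=3" */

/* Extract the cache coherency attribute from a Config register value. */
static uint32_t config_get_cca(uint32_t config)
{
    return config & CONFIG_K0_MASK;
}

/* Return a Config value with only the K0 field replaced by cca. */
static uint32_t config_set_cca(uint32_t config, uint32_t cca)
{
    return (config & ~CONFIG_K0_MASK) | (cca & CONFIG_K0_MASK);
}

/* 1 if kseg0 is mapped uncached, which would explain zero cache hits
 * even when the cache hardware itself is switched on. */
static int cca_is_uncached(uint32_t config)
{
    return config_get_cca(config) == CCA_UNCACHED;
}
```

With a made-up Config value of 0x80000482, config_get_cca returns 2 (uncached), and config_set_cca(0x80000482, 3) yields 0x80000483. A check like this early in the boot log would have pointed at the cache policy much sooner than the BogoMIPS number did.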


3. EXPORT_SYMBOL and EXPORT_SYMBOL_GPL. If your driver or kernel module uses symbols exported with the latter, the module must be GPL'ed. This can cause problems for us, as we don't want to open-source all of our kernel modules, especially the WLAN driver. We deliver our WLAN drivers as binary kernel modules.
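
The rule can be modelled in a few lines of user-space C. This is only a sketch of the idea, not the kernel's code: the function names are hypothetical, and the list of license strings is my assumption of what the kernel treats as GPL-compatible in MODULE_LICENSE(), so check include/linux/license.h in your tree before relying on it.

```c
#include <string.h>

/* Hypothetical model of the kernel's license check. The strings are
 * assumed to match the GPL-compatible MODULE_LICENSE() values. */
static int model_license_is_gpl_compatible(const char *license)
{
    static const char *const ok[] = {
        "GPL", "GPL v2", "GPL and additional rights",
        "Dual BSD/GPL", "Dual MIT/GPL", "Dual MPL/GPL",
    };
    size_t i;

    for (i = 0; i < sizeof(ok) / sizeof(ok[0]); i++)
        if (strcmp(license, ok[i]) == 0)
            return 1;
    return 0;
}

/* A symbol exported with EXPORT_SYMBOL is usable by any module;
 * one exported with EXPORT_SYMBOL_GPL only by GPL-compatible modules. */
static int model_may_use_symbol(int symbol_is_gpl_only,
                                const char *module_license)
{
    return !symbol_is_gpl_only ||
           model_license_is_gpl_compatible(module_license);
}
```

Under this model, a module declaring "Proprietary" can still link against plain EXPORT_SYMBOL symbols but is refused any EXPORT_SYMBOL_GPL symbol, which is the situation a binary-only WLAN module runs into.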