NAND/timing fixes #52

Merged: 5 commits merged into MIPS:ci20-v3.18 on Jul 24, 2015

aejsmith

Here are a bunch of fixes and workarounds for various issues.

Alex Smith added 3 commits July 22, 2015 13:12
The Ethernet controller requires the RD/WE pins to be configured, plus
its chip select pin, CS6. U-Boot already configures these correctly at
boot, so this doesn't actually fix any issue; just add them for the
sake of correctness.

Signed-off-by: Alex Smith <alex.smith@imgtec.com>
No need to read the OOB first; reading the page data followed by the
OOB works fine. Reading the OOB first requires an extra read command
cycle, which we can avoid.

Signed-off-by: Alex Smith <alex.smith@imgtec.com>
If nand_wait_ready() times out, this is silently ignored, and its
caller will then proceed to read from/write to the chip before it is
ready. This can potentially result in corruption with no indication as
to why.

While a 20ms timeout seems like it should be plenty, certain
behaviour can cause it to time out much earlier than expected. The
situation which prompted this change was that CPU 0, which is
responsible for updating jiffies, was holding interrupts disabled
for a fairly long time while writing to the console during a printk,
causing several jiffies updates to be delayed. If CPU 1 happens to
enter the timeout loop in nand_wait_ready() just before CPU 0 re-
enables interrupts and updates jiffies, CPU 1 will immediately time
out when the delayed jiffies updates are made. The result of this is
that nand_wait_ready() actually waits less time than the NAND chip
would normally take to be ready, and then read_page() proceeds to
read out bad data from the chip.

The situation described above may seem unlikely, but in fact it can be
reproduced almost every boot on the MIPS Creator Ci20.

Debugging this was made more difficult by the misleading comment above
nand_wait_ready() stating "The timeout is caught later" - no timeout
was ever reported so I did not initially think that this would be the
cause of the problem.

Therefore, this patch increases the timeout to 200ms. This should be
enough to cover cases where jiffies updates get delayed. Additionally,
add a pr_warn() when a timeout does occur so that it is easier to
pinpoint any problems in future caused by the chip not becoming ready.
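
The race described above can be sketched in user space. This is only a simplified model, not the driver code: sim_time_before() is modelled on the kernel's time_before(), while sim_wait_real_ticks(), its parameters and the starting counter value are hypothetical names and numbers chosen for illustration. The chip is assumed never to signal ready, so the return value is how much real time the loop actually waits.

```c
#include <stdbool.h>

/* Wraparound-safe "a is before b", modelled on the kernel's time_before(). */
bool sim_time_before(unsigned long a, unsigned long b)
{
    return (long)(a - b) < 0;
}

/*
 * Hypothetical model of a jiffies-based wait on a chip that never signals
 * ready. pending_ticks is the number of jiffies updates that were delayed
 * and land in one burst just after the deadline has been computed.
 * Returns how many real tick periods pass before the loop gives up.
 */
unsigned long sim_wait_real_ticks(unsigned long timeout_ticks,
                                  unsigned long pending_ticks)
{
    unsigned long jiffies = 1000;              /* stale counter value (arbitrary) */
    unsigned long deadline = jiffies + timeout_ticks;
    unsigned long real_ticks = 0;

    jiffies += pending_ticks;                  /* delayed updates arrive at once */

    while (sim_time_before(jiffies, deadline)) {
        jiffies++;                             /* one normal tick of real time */
        real_ticks++;
    }
    return real_ticks;
}
```

Assuming HZ=250 purely for the sake of the example (the board's HZ is not stated here), 20ms is 5 ticks and 200ms is 50 ticks. With 5 delayed updates pending, sim_wait_real_ticks(5, 5) returns 0, so the wait ends immediately, whereas sim_wait_real_ticks(50, 5) returns 45 and still leaves most of the timeout intact.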

Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Alex Smith added 2 commits July 22, 2015 13:25
Commit 600e7a2 sets a bit in Config7 which stops the XBurst core
from special-casing short loops to avoid branch target buffer lookups.
The default behaviour vastly slows down tight loops and thus results
in a low BogoMIPS/loops_per_jiffy value, which is used for the *delay()
functions. Setting this bit results in a higher BogoMIPS/loops_per_jiffy
value.

However, even though that bit also gets set on the second core, for
reasons I cannot figure out, it does not appear to take effect until
later on. The result is that when the delay calibration is run on the
second core it will calculate a low value for loops_per_jiffy (and at
that point delays using the calibrated value will delay for the correct
amount of time), but later on delays will delay for far too short a time.
This can be observed with udelay_test: using taskset to restrict
udelay_test.sh to core 0 results in all tests passing. On core 1,
however, all tests fail.

Ingenic's kernel does not suffer from this problem, yet I cannot see
why: they set the short loop BTB lookup flag in the same place we do,
but their kernel correctly calibrates the delay loop.

So, for now, as a workaround, copy the loops_per_jiffy value from core 0
to core 1. With this, both cores are able to pass udelay_test.
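
The arithmetic behind the too-short delays can be sketched as follows. This is a rough model, not the kernel's udelay() implementation: the function names, HZ=100 and the per-iteration loop times are all made-up values for illustration. It assumes a delay loop simply spins for usecs * loops_per_jiffy * HZ / 1000000 iterations.

```c
/*
 * Number of delay-loop iterations a udelay-style helper would spin for,
 * given a calibrated loops_per_jiffy value (simplified model).
 */
unsigned long long sim_udelay_loops(unsigned long lpj, unsigned int hz,
                                    unsigned int usecs)
{
    return (unsigned long long)lpj * hz * usecs / 1000000ULL;
}

/*
 * Real time spent in the delay, in nanoseconds, if each loop iteration
 * currently takes ns_per_loop (hypothetical figure).
 */
unsigned long long sim_actual_delay_ns(unsigned long lpj, unsigned int hz,
                                       unsigned int usecs,
                                       unsigned int ns_per_loop)
{
    return sim_udelay_loops(lpj, hz, usecs) * ns_per_loop;
}
```

Suppose core 1 calibrates while a loop iteration takes 10ns, giving loops_per_jiffy = 1000000 at HZ=100. A 100us delay then spins for 10000 iterations, which is the correct 100000ns at calibration time. If iterations later take 2ns, the same 10000 iterations last only 20000ns. With core 0's value of 5000000 (calibrated at 2ns per iteration), sim_actual_delay_ns(5000000, 100, 100, 2) gives 100000ns again, which is why copying the value works as a stopgap.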

Signed-off-by: Alex Smith <alex.smith@imgtec.com>
… IPIs

The majority of SMP platforms handle their IPIs through do_IRQ()
which calls irq_{enter,exit}(). When a call function IPI is received,
smp_call_function_interrupt() is called which also calls
irq_{enter,exit}(), meaning irq_count is raised twice.

When tick broadcasting is used (which is implemented via a call
function IPI), this incorrectly causes all CPU idle time on the core
receiving broadcast ticks to be accounted as time spent servicing
IRQs, as account_process_tick() will account as such if irq_count is
greater than 1. This results in 100% CPU usage being reported on a
core which receives its ticks via broadcast.

This patch removes the SMP smp_call_function_interrupt() wrapper which
calls irq_{enter,exit}(). Platforms which handle their IPIs through
do_IRQ() now call generic_smp_call_function_interrupt() directly to
avoid incrementing irq_count a second time. Platforms which don't
(loongson, sgi-ip27, sibyte) call generic_smp_call_function_interrupt()
wrapped in irq_{enter,exit}().
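
The double counting can be illustrated with a small model. It is a sketch under simplifying assumptions, not kernel code: enum sim_acct and sim_tick_via_ipi() are hypothetical names, and the accounting rule is reduced to the one stated above (a tick is charged to IRQ time when the nesting count is greater than 1).

```c
#include <stdbool.h>

/* Where a tick on an otherwise idle CPU ends up being accounted. */
enum sim_acct { SIM_ACCT_IDLE, SIM_ACCT_IRQ };

/*
 * Hypothetical model of a broadcast tick delivered by a call function IPI
 * to an idle CPU. wrapper_nests selects whether the IPI handler calls the
 * irq_enter()/irq_exit() equivalent a second time inside do_IRQ().
 */
enum sim_acct sim_tick_via_ipi(bool wrapper_nests)
{
    int irq_count = 0;
    enum sim_acct result;

    irq_count++;                    /* do_IRQ() -> irq_enter() */
    if (wrapper_nests)
        irq_count++;                /* wrapper -> irq_enter() again */

    /*
     * The tick handler always runs at nesting level 1. Anything above
     * that looks as if the tick interrupted another IRQ handler.
     */
    result = (irq_count > 1) ? SIM_ACCT_IRQ : SIM_ACCT_IDLE;

    if (wrapper_nests)
        irq_count--;                /* wrapper -> irq_exit() */
    irq_count--;                    /* do_IRQ() -> irq_exit() */

    return result;
}
```

In this model sim_tick_via_ipi(true) yields SIM_ACCT_IRQ, matching the 100% CPU usage symptom, and sim_tick_via_ipi(false) yields SIM_ACCT_IDLE, which is the behaviour once the extra nesting is removed.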

Signed-off-by: Alex Smith <alex.smith@imgtec.com>
ZubairLK added a commit that referenced this pull request Jul 24, 2015
NAND + timing + other smp fixes
@ZubairLK ZubairLK merged commit 351e3a7 into MIPS:ci20-v3.18 Jul 24, 2015