Hi Peter:

On 2012-02-13 23:09, Peter Barada wrote:
>> > Here is my question:
>> > 1. Is my patch wrong?
>> > 2. Why the official yaffs2 code assume 3 chunkErrorStrike to
>> > retire a block? Reduce to 1 chunkErrorStrike will wrongly
>> > mark the good block bad?
>> > 3. Should I remove the patch?
>> >
>> > Thanks a lot for your advice.
> Yes, your patch is wrong as any read error will retire the block.
>
> If you see bit-flips from data read out of MTD, then your NAND driver
> isn't properly using ECC to correct the data. If MTD used ECC to
> correct the data you would see a -EUCLEAN return from MTD on read which
> will percolate through yaffs_HandleChunkError() - and increment the
> strike count.

Thanks for your reply. Now I know the patch is wrong.

I've read the Samsung NAND chip datasheet and analysed the kernel log. I
think the large number of struck-out blocks is caused by errors in write
operations, but it is strange that those blocks went into the program
error state. According to the chip datasheet, if a program operation
results in an error, the block containing the failed page should be
mapped out and the target data copied to another block. So it is
reasonable for yaffs to retire the block in yaffs_HandleWriteChunkError
even if the chunk error strike count is only one. But why are there so
many program errors? Any ideas?

In addition, I use hardware ECC in the MTD driver, and the
error-correcting code is a Hamming code. The NAND chip is MLC, so the
hardware ECC can't correct multi-bit errors and MTD returns a read error
to yaffs; this may increase the number of blocks struck out. I wonder
how yaffs handles uncorrectable bit errors in order to keep the
filesystem data reliable and intact. If key yaffs2 data read from NAND
has errors in some bits, how can yaffs2 work without crashing?

Thanks again.

Regards,
Xueqin Chen
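
For reference, the retirement policy discussed in this thread can be modelled roughly as below. This is only a sketch, not the yaffs2 source: `block_state`, `on_read_error`, `on_write_error` and `STRIKES_TO_RETIRE` are hypothetical names, and the threshold of 3 is simply the figure quoted in the questions above. It assumes that an ECC-reported read problem adds one strike, while a program failure retires the block at once, as the datasheet advice suggests.

```c
#include <stdbool.h>

/* Hypothetical model of the policy discussed in the thread, not yaffs2 code.
 * The threshold is an assumption taken from the "3 chunkErrorStrike" figure. */
enum { STRIKES_TO_RETIRE = 3 };

struct block_state {
    int  strikes;        /* chunk errors recorded against this block */
    bool needs_retiring; /* true once the block should be marked bad */
};

/* Read problem reported by ECC (e.g. MTD returned -EUCLEAN): add a strike
 * and retire only when the threshold is reached. */
static void on_read_error(struct block_state *b)
{
    b->strikes++;
    if (b->strikes >= STRIKES_TO_RETIRE)
        b->needs_retiring = true;
}

/* Program (write) failure: record the strike and retire immediately,
 * following the datasheet advice to map the block out. */
static void on_write_error(struct block_state *b)
{
    b->strikes++;
    b->needs_retiring = true;
}
```

With this model a block survives two corrected read errors and is retired on the third, whereas a single program failure is enough to retire it. Lowering the read threshold to 1 would make every corrected bit-flip retire a block.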