2012-02-13

[IPNC][DM368] BBT problem after patch to new ECC layout (Bad block table not found for chip 0)

This is a follow-up to another NAND ECC post: NAND ECC issue - ECC layout incompatible between RBL/UBL and U-Boot/Linux

After patching in the new ECC table, we found that "Bad block table not found for chip 0" shows up on every boot, even though "Bad block table written to 0x07ffc000" follows it each time. This means the BBT is never written in a form that can be found again, in both U-Boot and the Linux kernel.

After (painfully) tracing the ECC and BBT source code in U-Boot, I found the root cause: the BBT and mirror patterns are written to the wrong location in the OOB area.

1. Look at the following illustration, captured from TI's wiki (http://processors.wiki.ti.com/index.php/DM365_Nand_ECC_layout)


In the file drivers/mtd/nand/nand_bbt.c, there are two nand_bbt_descr structures, bbt_main_descr and bbt_mirror_descr, which describe how the BBT is placed and searched for.

Originally, the structures look like this (2010-12 U-Boot release; I compared it with the latest one, Dec 2011, and there is no meaningful difference regarding ECC and BBT).
===

static uint8_t bbt_pattern[] = {'B', 'b', 't', '0' };
static uint8_t mirror_pattern[] = {'1', 't', 'b', 'B' };
static struct nand_bbt_descr bbt_main_descr = {
.options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE
| NAND_BBT_2BIT | NAND_BBT_VERSION | NAND_BBT_PERCHIP,
.offs = 8,
.len = 4,
.veroffs = 12,
.maxblocks = 4,
.pattern = bbt_pattern
};
static struct nand_bbt_descr bbt_mirror_descr = {
.options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE
| NAND_BBT_2BIT | NAND_BBT_VERSION | NAND_BBT_PERCHIP,
.offs = 8,
.len = 4,
.veroffs = 12,
.maxblocks = 4,
.pattern = mirror_pattern
};


===

2. From check_pattern() in nand_bbt.c, we learn what the structure members mean:
offs - the offset, from the start of the page's OOB area, at which the BBT pattern is placed
len - the length of the BBT pattern
veroffs - the offset, from the start of the page's OOB area, at which the BBT version number is placed

Clearly these settings are wrong once the new ECC layout (the upper table in the picture) is in use: the BBT pattern occupies OOB bytes 8 to 11 and the version number sits at byte 12, all of which now belong to the ECC bytes at 6 to 15.

I changed offs to 16 and veroffs to 32, both of which fall in free OOB regions of the new layout, and the BBT is back again! The corresponding portion of the Linux source should be changed too. In Appro's IPNC release, the change can be made in ti-davinci/arch/arm/mach-davinci/board-dm368-ipnc.c
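
To make the conflict concrete, here is a minimal C sketch that counts how many bytes of a BBT placement land on ECC bytes of the new 2K-page layout. The eccpos table is copied from the TI patch in the ECC post; the struct bbt_placement type and the helper names are made up for this illustration and are not part of U-Boot or Linux. It assumes the version number occupies a single OOB byte at veroffs, which is my reading of nand_bbt.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ECC byte positions of the new 2K-page layout, copied from the patch. */
static const uint8_t new_eccpos[40] = {
     6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
    22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
    38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
    54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
};

/* Hypothetical helper type: only the descriptor fields that decide
 * where the BBT marker lands in the OOB area. */
struct bbt_placement {
    unsigned offs;    /* first OOB byte of the pattern */
    unsigned len;     /* pattern length in bytes */
    unsigned veroffs; /* OOB byte holding the version (assumed 1 byte) */
};

/* Return 1 if OOB byte `pos` is an ECC byte in the new layout. */
static int is_ecc_byte(unsigned pos)
{
    for (size_t i = 0; i < sizeof(new_eccpos); i++)
        if (new_eccpos[i] == pos)
            return 1;
    return 0;
}

/* Count the pattern and version bytes that collide with ECC bytes. */
static unsigned bbt_ecc_collisions(const struct bbt_placement *p)
{
    unsigned hits = 0;

    assert(p->len > 0);
    for (unsigned i = 0; i < p->len; i++)
        hits += (unsigned)is_ecc_byte(p->offs + i);
    hits += (unsigned)is_ecc_byte(p->veroffs);
    return hits;
}

/* Original descriptor values and the values I switched to. */
static const struct bbt_placement old_placement = { 8, 4, 12 };
static const struct bbt_placement new_placement = { 16, 4, 32 };
```

With the original values every pattern byte and the version byte fall inside the ECC area, while the new values touch none of them, which matches what I saw on the board.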

[DM368][IPNC] NAND ECC issue - ECC layout incompatible between RBL/UBL and U-Boot/Linux

We ran into big trouble when we needed to update U-Boot from within U-Boot to fix a fatal bug in its GPIO configuration. I was surprised to find that TI doesn't synchronize the ECC layout between RBL/UBL and U-Boot/Linux.

The result is that even though we can write the new U-Boot from the existing one, UBL treats the rewritten NAND blocks as invalid because of the ECC mismatch, so the new U-Boot image is never loaded. The same thing happens when we try to update UBL from U-Boot.

Fortunately, TI does provide a solution for syncing the ECC layout on its wiki (http://processors.wiki.ti.com/index.php/DM365_Nand_ECC_layout)
The steps are fairly simple:
1. patch u-boot code
===

diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
index 4ca738e..4ba12ce 100644
--- a/drivers/mtd/nand/davinci_nand.c
+++ b/drivers/mtd/nand/davinci_nand.c
@@ -278,5 +278,13 @@ static int nand_davinci_correct_data(struct mtd_info *mtd, u_char *dat,
 static struct nand_ecclayout nand_davinci_4bit_layout_oobfirst = {
 #if defined(CONFIG_SYS_NAND_PAGE_2K)
  .eccbytes = 40,
+        .eccpos = {6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+                   22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
+                   38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
+                   54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
+                   },
+        .oobfree = {{2, 4}, {16, 6}, {32, 6}, {48, 6}},
+#if 0
  .eccpos = {
   24, 25, 26, 27, 28,
   29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
@@ -288,6 +296,7 @@ static struct nand_ecclayout nand_davinci_4bit_layout_oobfirst = {
  .oobfree = {
   {.offset = 2, .length = 22, },
  },
+#endif
 #elif defined(CONFIG_SYS_NAND_PAGE_4K)
  .eccbytes = 80,
  .eccpos = {

===

2. patch Linux kernel
===
diff --git a/arch/arm/mach-davinci/board-dm365-evm.c b/arch/arm/mach-davinci/board-dm365-evm.c
index 5d3946e..4bce9db 100644
--- a/arch/arm/mach-davinci/board-dm365-evm.c
+++ b/arch/arm/mach-davinci/board-dm365-evm.c
@@ -89,5 +89,17 @@ static struct mtd_partition davinci_nand_partitions[] = {
  /* two blocks with bad block table (and mirror) at the end */
 };
 
+static struct nand_ecclayout dm365_evm_nand_ecclayout = {
+ .eccbytes = 40,
+ .eccpos  = {6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
+     22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
+     38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
+     54, 55, 56, 57, 58, 59, 60, 61, 62, 63,
+ },
+ .oobfree = {{2, 4}, {16, 6}, {32, 6}, {48, 6} },
+};
+
 static struct davinci_nand_pdata davinci_nand_data = {
  .mask_chipsel  = BIT(14),
  .parts   = davinci_nand_partitions,
@@ -96,6 +108,7 @@ static struct davinci_nand_pdata davinci_nand_data = {
  .ecc_mode  = NAND_ECC_HW,
  .options  = NAND_USE_FLASH_BBT,
  .ecc_bits  = 4,
+ .ecclayout  = &dm365_evm_nand_ecclayout,
 };
 
 static struct resource davinci_nand_resources[] = {
diff --git a/arch/arm/mach-davinci/include/mach/nand.h b/arch/arm/mach-davinci/include/mach/nand.h
index b2ad809..7c6be2b 100644
--- a/arch/arm/mach-davinci/include/mach/nand.h
+++ b/arch/arm/mach-davinci/include/mach/nand.h
@@ -83,6 +83,9 @@ struct davinci_nand_pdata {  /* platform_data */
  /* Main and mirror bbt descriptor overrides */
  struct nand_bbt_descr *bbt_td;
  struct nand_bbt_descr *bbt_md;
+
+    /*Nand ECC layout*/
+    struct nand_ecclayout   *ecclayout;
 };
 
 #endif /* __ARCH_ARM_DAVINCI_NAND_H */
diff --git a/drivers/mtd/nand/davinci_nand.c b/drivers/mtd/nand/davinci_nand.c
index 06ee8c8..4e4ed73 100644
--- a/drivers/mtd/nand/davinci_nand.c
+++ b/drivers/mtd/nand/davinci_nand.c
@@ -762,7 +762,10 @@ static int __init nand_davinci_probe(struct platform_device *pdev)
    goto syndrome_done;
   }
   if (chunks == 4) {
-   info->ecclayout = hwecc4_2048;
+   if (pdata->ecclayout != NULL)
+    info->ecclayout = *(pdata->ecclayout);
+   else
+    info->ecclayout = hwecc4_2048;
    info->chip.ecc.mode = NAND_ECC_HW_OOB_FIRST;
    goto syndrome_done;
   }

===

3. Update U-Boot. If UART or SD boot had been kept available on the board, it would be easy to just flash the new U-Boot image. We were not that lucky, so we had a chicken-and-egg problem. Our first idea was to load the new U-Boot image to a different memory address and jump to it, so that the new U-Boot could program itself into NAND.

To make the new U-Boot image run from a different memory address (instead of the default 0x81080000), we changed the setting in (u-boot root)/board/davinci/dm365evm/config.mk and built a new (ECC patched) U-Boot image, say uboot.82
===
CONFIG_SYS_TEXT_BASE = 0x82000000
===
Then we load uboot.82 to RAM address 0x82000000 using tftpboot or from USB. "go 0x82000000" then executes the new U-Boot image, which can write NAND with the new ECC layout.

4. If step 3 fails for some reason (it did for us, because an incorrect DRAM setting made memory copies fail now and then), there is another hack to work around the problem: we replaced the ECC layout table in the RAM copy of the running U-Boot!


Thankfully the size of the ECC layout structure (static struct nand_ecclayout) stays the same, so the replacement can be done by just modifying the contents of the old table in RAM. Searching for the pattern 0x00000006, 0x00000007... starting from 0x81080000 turns up only one occurrence in RAM. After carefully calculating the offsets, it's easy to change the table to the new layout described in step 1. Now, great! We can update not only U-Boot but also UBL from the current U-Boot!
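
The search-and-replace idea in step 4 can be sketched in C as below. This is only an illustration that works on a buffer of 32-bit words (for example a dump of the U-Boot RAM region); on the board itself I did the equivalent by hand from the U-Boot prompt. The function names find_words and patch_unique are made up for this sketch, and it assumes the table entries are stored as consecutive 32-bit words, as the search pattern above suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return the index of the first occurrence of `needle` in `hay`
 * at or after `start`, or -1 if there is none. */
static long find_words(const uint32_t *hay, size_t hay_len, size_t start,
                       const uint32_t *needle, size_t needle_len)
{
    assert(needle_len > 0);
    for (size_t i = start; i + needle_len <= hay_len; i++)
        if (memcmp(&hay[i], needle, needle_len * sizeof(uint32_t)) == 0)
            return (long)i;
    return -1;
}

/* Overwrite `old_seq` with `new_seq` only if `old_seq` occurs exactly
 * once, so that unrelated data is never patched by accident.
 * Returns 1 on success, 0 if the sequence is missing or ambiguous. */
static int patch_unique(uint32_t *hay, size_t hay_len,
                        const uint32_t *old_seq, const uint32_t *new_seq,
                        size_t seq_len)
{
    long first = find_words(hay, hay_len, 0, old_seq, seq_len);

    if (first < 0)
        return 0; /* not found */
    if (find_words(hay, hay_len, (size_t)first + 1, old_seq, seq_len) >= 0)
        return 0; /* more than one match: too risky to patch */

    /* Same length in and out, so nothing else in RAM moves. */
    memcpy(&hay[first], new_seq, seq_len * sizeof(uint32_t));
    return 1;
}
```

Refusing to patch when the pattern shows up more than once is the same check I did by eye: the trick is only safe because there is a single occurrence and the structure size does not change.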


Followups


There is a side effect after patching in the new ECC table: the BBT (Bad Block Table) can no longer be found or updated. I discuss and fix this issue in another post:
[IPNC][DM368] BBT problem after patch to new ECC layout (Bad block table not found for chip 0)


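Note:
To update U-Boot from U-Boot, we have to prepare an image header for the new U-Boot image in RAM.
Using RAM location 0x82000000 as an example, the steps are:
1. Clear the RAM buffer. (mw.l counts 32-bit words, so a count of 0x60000 fills 0x180000 bytes, more than the 0x60000 bytes written in step 4.)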

mw.l 0x82000000 0xffffffff 0x60000

2. Prepare the header (it takes the first page, 2 KB in my case)


mw.l 0x82000000 0xa1aced66
mw.l 0x82000004 0x81080000
mw.l 0x82000008 0x000000a0
mw.l 0x8200000c 0x0000000a
mw.l 0x82000010 0x00000001
mw.l 0x82000014 0x81080000


For reference, the header definition:
typedef struct _NANDBOOT_HEADER_
{ 
 Uint32 magicNum;            // Expected magic number 
 Uint32 entryPoint;          // Entry point of the user application 
 Uint32 numPage;             // Number of pages where boot loader is stored 
 Uint32 block;               // Starting block number where User boot loader is stored
 Uint32 page;                // Starting page number where boot-loader is stored 
 Uint32 ldAddress;           // Starting RAM address where image is to copied - XIP Mode 
 Uint32 forceContigImage;    // Force blocks to be contiguous (used for UBL image since RBL can't deal with skipping bad blocks) 
 Uint32 startBlock;          // Starting block number to attempt placing copies of image 
 Uint32 endBlock;            // Ending block number to stop copy placement 
}
NANDBOOT_HeaderObj,*NANDBOOT_HeaderHandle;


3. Load U-Boot (built with the new ECC layout, linked to start at 0x81080000) to RAM, right after the header page

tftpboot 0x82000800 uboot.newecc


4. Write to NAND (0x140000 is the NAND offset of the starting block in my case)
nand write 0x82000000 0x140000 0x60000
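
The header values above can also be generated instead of typed by hand. The following is a rough C sketch under my own assumptions: the struct keeps only the first six fields of NANDBOOT_HeaderObj, the helper names are invented for this post, numPage is taken to count image pages only, and the 128 KB block size used in the usage note is inferred from block 0xa landing at NAND offset 0x140000.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define NANDBOOT_MAGIC 0xa1aced66u /* magic number used in the mw.l lines above */

/* First six fields of NANDBOOT_HeaderObj, in on-flash order. */
struct nandboot_hdr {
    uint32_t magicNum;
    uint32_t entryPoint;
    uint32_t numPage;
    uint32_t block;
    uint32_t page;
    uint32_t ldAddress;
};

/* Number of NAND pages needed for an image, rounded up. */
static uint32_t pages_for(uint32_t image_bytes, uint32_t page_bytes)
{
    assert(page_bytes > 0);
    return (image_bytes + page_bytes - 1) / page_bytes;
}

/* NAND byte offset of the first page of a block. */
static uint32_t nand_offset(uint32_t block, uint32_t block_bytes)
{
    return block * block_bytes;
}

/* Fill a header for an image stored from page 1 of `block`
 * (page 0 holds the header itself) and run from `load_addr`. */
static struct nandboot_hdr make_hdr(uint32_t image_bytes, uint32_t page_bytes,
                                    uint32_t block, uint32_t load_addr)
{
    struct nandboot_hdr h;

    h.magicNum = NANDBOOT_MAGIC;
    h.entryPoint = load_addr;
    h.numPage = pages_for(image_bytes, page_bytes);
    h.block = block;
    h.page = 1;
    h.ldAddress = load_addr;
    return h;
}

/* Print the U-Boot mw.l commands that place the header at ram_addr. */
static void print_mw_cmds(const struct nandboot_hdr *h, uint32_t ram_addr)
{
    const uint32_t words[6] = { h->magicNum, h->entryPoint, h->numPage,
                                h->block, h->page, h->ldAddress };

    for (unsigned i = 0; i < 6; i++)
        printf("mw.l 0x%08x 0x%08x\n",
               (unsigned)(ram_addr + 4u * i), (unsigned)words[i]);
}
```

For example, make_hdr(0x50000, 2048, 0xa, 0x81080000) yields numPage = 0xa0, and passing the result to print_mw_cmds() with 0x82000000 reproduces the six mw.l lines from step 2. nand_offset(0xa, 0x20000) gives the 0x140000 used in step 4.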