Tutorial 4: Dictionary Export and Import

Export and import functionality allows you to move all Chinese Toolbox FREE data out of and into the program, respectively. There are a number of reasons why you might want to use this functionality.

  1. You could export the data, make a copy, and maintain two files, one for your character understanding data (e.g. UnderstandingData.txt), and one for the character dictionary.
  2. If you want to move your understanding data to another computer, you need to export the data, copy CharacterDictionary.txt to the Chinese Toolbox document directory of the second computer, then import the data into Chinese Toolbox FREE on your second computer.
  3. If your school or university already has a Chinese dictionary stored in a database or other file, that data could be incorporated into CharacterDictionary.txt, or a new CharacterDictionary.txt file could be created from the custom or university dictionary. The new CharacterDictionary.txt file would just need to be in a format that Chinese Toolbox FREE recognizes.
  4. The previous step could be extended to incorporate a non-English dictionary into Chinese Toolbox FREE.

If you’re already familiar with concepts discussed in this tutorial, you may not need to read all of this. To save you a little time, here is a summary of the main points:

  1. If you simply want to use your Chinese Toolbox FREE data on a second computer (i.e. you don’t need to edit it), you don’t necessarily need to export from the original computer and import on the second computer. You can just copy the data files from the Chinese Toolbox document directory on the original computer to the Chinese Toolbox document directory on the second computer. These data files are named ChineseToolbox2.dat and ChineseToolbox.cfg.
  2. Chinese Toolbox FREE data falls into two main categories: dictionary data and understanding data. In the current version of Chinese Toolbox FREE, you cannot specify what is to be exported — everything is exported.
  3. The program writes the export to CharacterDictionary.txt in the Chinese Toolbox documents directory.
  4. The import reads from the same file.
  5. CharacterDictionary.txt is written in Unicode UTF-8 format. If you edit this file, you must save it in UTF-8 format before attempting to import it back into Chinese Toolbox FREE. The import will fail if the file is written in UTF-16 or any other format.
  6. When you edit the file, you can remove any columns except the first, the one with the header ctsf_CHAR_SINGLE.
  7. When you edit the file, you can remove any rows except the first. This first row contains the column headers which are used to associate the column data with target frames in Chinese Toolbox FREE.
  8. Columns can be reordered, and rows can be sorted on any column. When you sort, be sure to specify that the file has a header row, so that the header row is not sorted with the data.
  9. By removing data (rows and/or columns) from CharacterDictionary.txt, you can control what is imported back into Chinese Toolbox FREE. For example, if you remove all the character dictionary columns from the file, leaving only the understanding data, then the Chinese Toolbox FREE built-in dictionary will not be affected when you import the data back into Chinese Toolbox FREE. Only the understanding data in the program is updated.
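
Points 5 through 7 above can be checked before you attempt an import. The following Python sketch is not part of Chinese Toolbox FREE; the function name check_export_bytes is hypothetical, and the sketch assumes the file is tab-delimited with the column headers on the first line. It only tests that the bytes decode as UTF-8 and that the ctsf_CHAR_SINGLE header is still present. Whether the program accepts a UTF-8 byte order mark is not stated in this tutorial, so the sketch simply tolerates one.

```python
REQUIRED_HEADER = "ctsf_CHAR_SINGLE"  # the one column the tutorial says must remain


def check_export_bytes(raw):
    """Return a list of problems found; an empty list means these basic checks passed.

    Hypothetical helper, not part of Chinese Toolbox FREE.
    """
    # A UTF-16 byte order mark means the file was saved in the wrong format.
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return ["file starts with a UTF-16 byte order mark; save it as UTF-8"]
    try:
        # "utf-8-sig" decodes UTF-8 and drops a leading UTF-8 BOM if one exists.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError:
        return ["file is not valid UTF-8"]

    lines = text.splitlines()
    if not lines:
        return ["file is empty, so the header row is missing"]

    # Assumption: headers sit on the first line, separated by tabs.
    headers = lines[0].split("\t")
    if REQUIRED_HEADER not in headers:
        return ["header row does not contain " + REQUIRED_HEADER]
    return []
```

To check a real file, read it in binary mode (for example with open(path, "rb")) and pass the bytes to the function. Passing these checks does not guarantee that the import will succeed; it only rules out the problems named in the summary.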

If any of this is unclear, proceed with the remainder of this tutorial. Clicking on any of the screen shots below will display the full-size image in your browser.

First, the export functionality:

[Screen shot: Export menu]

Exporting is the easy part. Just point to and click on the “Export character dictionary” menu item under the File menu. The export process takes only a few seconds. When finished, a new file will exist in the Chinese Toolbox documents directory. On my computer (running Vista Home Basic), this file appears in the c:\Users\atsherrill\Documents\Chinese Toolbox\ directory. The screen shot below shows the new CharacterDictionary.txt file.

Note that the program will always write the export data to a CharacterDictionary.txt file in the Chinese Toolbox documents directory. If this file already exists, the next export will overwrite the original CharacterDictionary.txt without any warning. So before you export a second time, rename any existing export file to something other than CharacterDictionary.txt.
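
If you export often, the rename can be scripted. This is a minimal Python sketch, not a feature of Chinese Toolbox FREE: the function name back_up_export and the timestamped backup name are my own choices, and you must pass in the path of your own Chinese Toolbox documents directory.

```python
from datetime import datetime
from pathlib import Path


def back_up_export(doc_dir):
    """Rename an existing CharacterDictionary.txt so the next export cannot overwrite it.

    Returns the path of the renamed file, or None if there was nothing to rename.
    Hypothetical helper; the backup naming scheme is an assumption.
    """
    export = Path(doc_dir) / "CharacterDictionary.txt"
    if not export.exists():
        return None
    # A timestamp keeps successive backups from colliding with each other.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    backup = export.with_name("CharacterDictionary-" + stamp + ".txt")
    export.rename(backup)
    return backup
```

Run it before each export, passing the documents directory as the argument. Remember that the import always reads CharacterDictionary.txt, so a backup must be renamed back before it can be imported.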

In this tutorial, I use Microsoft Excel 2003 to remove columns containing dictionary data only, leaving only “character understanding” and “need to learn” data. After saving the new file in the proper format, it can be moved to another computer. However, Excel 2003 cannot write text files in UTF-8 format. The only Unicode format supported by Excel 2003 is UTF-16. So after saving the file in Excel, I’ll need to use another program to convert it to UTF-8 format. Let’s get started.

First, click on the Open menu item under the Excel File menu. The Excel Open dialog will appear. Select the CharacterDictionary.txt file in the Chinese Toolbox document directory as shown below, and click on the Open button.

The Excel import wizard will display three dialogs, one after the other. In the first, select “Delimited” for the file type and “65001 : Unicode (UTF-8)” for the file origin, as shown below:

At the second import wizard dialog (below), just click the Next button. The “Tab” checkbox should already be checked. If it isn’t, check it.

At the third import wizard dialog (below), you likewise shouldn’t need to make any changes. Just click on the Finish button.

After a few seconds, Excel will display the dictionary file in its window. After a little formatting (wrapping text and widening columns), the Excel window appears as follows: 

At this point the columns unrelated to your understanding of characters can be removed. You should be left with the following:
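
If you would rather not use Excel for this step, the same column removal can be sketched in Python. The function name keep_columns is hypothetical, and the only header taken from this tutorial is ctsf_CHAR_SINGLE; you would supply the other column names exactly as they appear in the header row of your own export. The sketch assumes a tab-delimited file and always keeps ctsf_CHAR_SINGLE as the first column.

```python
import csv
import io

CHAR_HEADER = "ctsf_CHAR_SINGLE"  # must never be removed


def keep_columns(tsv_text, wanted):
    """Return tab-delimited text holding only CHAR_HEADER plus the columns in `wanted`.

    Hypothetical helper. Raises ValueError if a requested header is not in the file.
    """
    rows = list(csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t"))
    header = rows[0]
    # Keep the character column first, then the requested columns in the order given.
    names = [CHAR_HEADER] + [name for name in wanted if name != CHAR_HEADER]
    indexes = [header.index(name) for name in names]

    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in rows:
        # Pad short rows so a missing trailing cell becomes an empty field.
        writer.writerow([row[i] if i < len(row) else "" for i in indexes])
    return out.getvalue()
```

Because Python reads and writes UTF-8 directly, a file produced this way (opened and saved with encoding="utf-8") does not need the UTF-16 to UTF-8 conversion that the Excel 2003 route requires.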

Now save this file in Unicode format. Select “Save As” from the Excel File menu. The following will appear:

Click the Save button, and the following dialog will appear requesting overwrite confirmation.

Click the “Yes” button, then at the following dialog click “Yes” to confirm writing of the file in Unicode format.

At this point you can close Excel. When you do so, Excel will present the following dialog:

This appears because the file has not been saved in the native Excel spreadsheet format. At this dialog, click “No” to confirm that you do not want to save the file again.

CharacterDictionary.txt now contains the “character understanding” and “need to learn” data from the original Chinese Toolbox FREE dictionary. However, the file is not yet in the proper format. It still needs to be converted from one Unicode format to another, that is, from UTF-16 to UTF-8. A number of free Unicode text editors are available from various download sites. Two that work for me are BabelPad and TxtEdit. At the time of this writing, the BabelStone web site was down. Just do a search; BabelPad is available from several software download sites.
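
If Python is installed, the conversion can also be done with a few lines of code instead of a text editor. This is a minimal sketch with a hypothetical function name, utf16_to_utf8. It assumes the file Excel saved begins with a UTF-16 byte order mark, which Python's "utf-16" codec uses to detect the byte order. The output is written without a byte order mark, and line endings are left unchanged.

```python
from pathlib import Path


def utf16_to_utf8(src, dst):
    """Read a UTF-16 text file and write the same text to dst as UTF-8.

    Hypothetical helper. Working on bytes avoids any newline translation.
    """
    # The "utf-16" codec reads the byte order mark and removes it from the text.
    text = Path(src).read_bytes().decode("utf-16")
    # Plain "utf-8" writes no byte order mark.
    Path(dst).write_bytes(text.encode("utf-8"))
```

Write the result to a new file name first, confirm that it opens correctly, and then rename it to CharacterDictionary.txt for the import.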

After you convert the file to UTF-8 format, you’re ready to copy the file to another computer running Chinese Toolbox FREE. Just place the file, CharacterDictionary.txt, in the Chinese Toolbox documents directory on the second computer. Start up Chinese Toolbox FREE, and click on File, then “Import character dictionary”. The program will automatically shut down when the import is complete. Start Chinese Toolbox FREE again to begin using it with the updated character understanding data.