The i18n Cookbook - recipes for a global society

Read a Unicode file

Problem:

You want to read a Unicode-encoded file into memory.

Solution:

To read a file whose charset differs from the system default, specify the charset when you construct the reader. InputStreamReader accepts either a Charset object or the String name of the charset (for example, "UTF-8").
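
As a minimal sketch of these two options, the hypothetical helper class below decodes the same UTF-8 bytes once by charset name and once with a Charset object. The class and method names are illustrative and not part of the original recipe, and the data comes from an in-memory byte array rather than a file so the example runs anywhere.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: the names here are illustrative only.
public class CharsetOptions {

    // Option 1: pass the charset's String name.
    // An unknown name raises UnsupportedEncodingException (a kind of IOException).
    public static String readFirstLineByName(byte[] data, String charsetName) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charsetName))) {
            return br.readLine();
        }
    }

    // Option 2: pass a Charset object, so no name lookup happens at read time.
    public static String readFirstLineByCharset(byte[] data, Charset charset) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            return br.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // "\u65E5\u672C" is 日本, escaped so the source file's own encoding does not matter.
        byte[] utf8 = "\u65E5\u672C".getBytes(StandardCharsets.UTF_8);
        String byName = readFirstLineByName(utf8, "UTF-8");
        String byCharset = readFirstLineByCharset(utf8, StandardCharsets.UTF_8);
        // Both forms decode the bytes to the same text.
        System.out.println(byName.equals(byCharset)); // prints "true"
    }
}
```

Passing a constant such as StandardCharsets.UTF_8 avoids a misspelled charset name surfacing only at run time, which is one reason to prefer the Charset form where it is available.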

To read a UTF-8 encoded file containing Japanese characters:

// Required imports
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStreamReader;

// try-with-resources closes the reader and the underlying stream automatically
try (BufferedReader br = new BufferedReader(       // buffer for performance
        new InputStreamReader(                     // decode bytes with the given charset
                new FileInputStream("C:\\files\\test.txt"), "UTF-8"))) {
    // Read the file line by line and print each line
    String txt;
    while ((txt = br.readLine()) != null) {
        System.out.println(txt);
    }
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

The output:

これはテストです。
高松
日本
米国
英国
世界
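
On Java 7 or later, one possible alternative to the InputStreamReader chain in this recipe is java.nio.file.Files.newBufferedReader, which takes the charset directly. The sketch below is an assumption-laden illustration rather than part of the original recipe: the class name is hypothetical, and it writes its own temporary sample file so that it does not depend on C:\files\test.txt existing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the name and the temporary file are illustrative only.
public class NioUnicodeRead {

    // Read every line of a UTF-8 encoded file into a list.
    public static List<String> readLines(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Create a small UTF-8 sample file for the demonstration.
        // "\u65E5\u672C" is 日本 and "\u4E16\u754C" is 世界, escaped to avoid source-encoding issues.
        Path tmp = Files.createTempFile("unicode-demo", ".txt");
        try {
            Files.write(tmp, Arrays.asList("\u65E5\u672C", "\u4E16\u754C"), StandardCharsets.UTF_8);
            for (String line : readLines(tmp)) {
                System.out.println(line);
            }
        } finally {
            Files.deleteIfExists(tmp); // clean up the sample file
        }
    }
}
```

One difference worth knowing: a reader obtained from Files.newBufferedReader throws an exception on malformed input, whereas InputStreamReader silently substitutes a replacement character, so the NIO form tends to expose a wrong charset choice earlier.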

If you are testing any of these recipes in Eclipse and the characters are not displaying correctly in your console, visit http://i18ncookbook.com/eclipse_settings.
