Apache POI Library Overview - Working with Microsoft Office Applications in Java

Apache POI is a powerful library for working with Microsoft Office documents in Java. It provides APIs for reading and writing Word and Excel files, and for working with Visio and other MS Office file formats.

An interesting fact: the name Apache POI is an abbreviation of "Poor Obfuscation Implementation". The name was a joke invented by programmers with a good sense of humor, but it stuck and became the project's name.

For those unfamiliar with the term, obfuscation is the deliberate scrambling of code to make a program's structure and algorithms harder to analyze when the application is decompiled.

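To work with Apache POI, you need to add it to your project. If you are using Maven, declare the dependency in your pom.xml (choose a stable version): the org.apache.poi:poi artifact covers the legacy binary formats such as .xls, and org.apache.poi:poi-ooxml adds the Office Open XML formats such as .xlsx and .docx. Alternatively, download the library from the official website and add the JAR files to your project manually.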

When working with the library, the unusual class names immediately catch your eye. For example, classes for working with Excel have the prefix HSSF: HSSFWorkbook, HSSFSheet, and others. The HSSF prefix stands for "Horrible SpreadSheet Format"!

Let's take a look at the main components of the Apache POI library:

  1. HSSF (Horrible Spreadsheet Format): reads and writes Microsoft Excel files in .xls format.
  2. XSSF (XML Spreadsheet Format): reads and writes Excel files in the Office Open XML format (.xlsx).
  3. HPSF (Horrible Property Set Format): works with the basic document properties of Microsoft Office files.
  4. HWPF (Horrible Word Processor Format): reads and writes Microsoft Word 97 files (.doc format).
  5. HSLF (Horrible Slide Layout Format): reads and writes Microsoft PowerPoint files.
  6. HDGF (Horrible DiaGram Format): reads Microsoft Visio files; its API is low-level and read-only.
  7. HPBF (Horrible PuBlisher Format): works with Microsoft Publisher files.
  8. HSMF (Horrible Stupid Mail Format): works with Microsoft Outlook MSG files.
  9. DDF (Dreadful Drawing Format): decodes the Microsoft Office Drawing format.
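The split between HSSF (.xls) and XSSF (.xlsx) reflects two different on-disk containers: legacy files are OLE2 compound documents, while Office Open XML files are ZIP packages. The sketch below uses only the JDK to tell them apart by their leading bytes. The class name OfficeFormatSniffer and the method componentFor are hypothetical helpers made up for this illustration, not part of Apache POI, and the code assumes the file is already known to be a spreadsheet (the same signatures also appear in .doc/.docx and .ppt/.pptx files).

```java
import java.util.Arrays;

/** Illustrative helper (not a POI class): guesses the spreadsheet component from a file header. */
final class OfficeFormatSniffer {

    // OLE2 compound-document signature used by the legacy binary formats (.xls, .doc, .ppt).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // ZIP local-file-header signature "PK\003\004" used by OOXML packages (.xlsx, .docx, .pptx).
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Returns "HSSF", "XSSF", or "unknown" based on the first bytes of a spreadsheet file. */
    static String componentFor(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return "HSSF";
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return "XSSF";
        }
        return "unknown";
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

In real code you would normally not do this by hand: POI's WorkbookFactory.create can open either kind of workbook and pick the matching implementation.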

Apache POI in Practice

Excel:

  • A detailed article with an example of reading data from Excel files in .xls and .xlsx formats.
  • Write data to an xls file (create a new Excel document).
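As a rough illustration of the Excel workflow, here is a minimal sketch that creates an .xls workbook with HSSF and reads the value back. It assumes the poi artifact is on the classpath; the class name XlsRoundTrip, the sheet name "Demo", and the in-memory byte array (used instead of a real file to keep the example self-contained) are choices made for this example only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.poi.hssf.usermodel.HSSFCell;
import org.apache.poi.hssf.usermodel.HSSFRow;
import org.apache.poi.hssf.usermodel.HSSFSheet;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;

/** Hypothetical demo class: writes one cell to an .xls workbook and reads it back. */
final class XlsRoundTrip {

    /** Builds a workbook with a single text cell (A1) and returns it as .xls bytes. */
    static byte[] writeXls(String text) throws IOException {
        try (HSSFWorkbook workbook = new HSSFWorkbook();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            HSSFSheet sheet = workbook.createSheet("Demo"); // sheet name is arbitrary
            HSSFRow row = sheet.createRow(0);               // rows are zero-based
            HSSFCell cell = row.createCell(0);              // so are columns
            cell.setCellValue(text);
            workbook.write(out);
            return out.toByteArray();
        }
    }

    /** Opens .xls bytes and returns the string stored in cell A1 of the first sheet. */
    static String readFirstCell(byte[] xlsBytes) throws IOException {
        try (HSSFWorkbook workbook = new HSSFWorkbook(new ByteArrayInputStream(xlsBytes))) {
            return workbook.getSheetAt(0).getRow(0).getCell(0).getStringCellValue();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = writeXls("Hello, POI");
        System.out.println(readFirstCell(bytes));
    }
}
```

To produce an actual file, pass a FileOutputStream to workbook.write instead of the ByteArrayOutputStream. Note that getRow and getCell return null for rows and cells that were never created, so code reading arbitrary files should check for that.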

Word:

  • Create a Word document in docx format using Apache POI.
  • Read Word document data (headers and footers, paragraphs).
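The Word side can be sketched in the same spirit. The example below creates a .docx document with XWPF, POI's component for the Office Open XML Word format, and then reads the paragraph text back. It assumes the poi-ooxml artifact and its dependencies are on the classpath; the class name DocxRoundTrip and the in-memory byte array are illustrative choices, and headers and footers are left out to keep the sketch short.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFRun;

/** Hypothetical demo class: writes paragraphs to a .docx document and reads them back. */
final class DocxRoundTrip {

    /** Creates a document with one paragraph per argument and returns it as .docx bytes. */
    static byte[] writeDocx(String... paragraphs) throws IOException {
        try (XWPFDocument document = new XWPFDocument();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            for (String text : paragraphs) {
                XWPFParagraph paragraph = document.createParagraph();
                XWPFRun run = paragraph.createRun(); // a run is a span of text with uniform formatting
                run.setText(text);
            }
            document.write(out);
            return out.toByteArray();
        }
    }

    /** Returns the text of every body paragraph in the given .docx bytes. */
    static List<String> readParagraphs(byte[] docxBytes) throws IOException {
        List<String> result = new ArrayList<>();
        try (XWPFDocument document = new XWPFDocument(new ByteArrayInputStream(docxBytes))) {
            for (XWPFParagraph paragraph : document.getParagraphs()) {
                result.add(paragraph.getText());
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = writeDocx("Apache POI", "Created with XWPF.");
        readParagraphs(bytes).forEach(System.out::println);
    }
}
```

As with the Excel sketch, writing to a FileOutputStream instead of the byte array produces a file that Word can open.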