Java Convert .docx File to .html File using XDocReport
Tags: docx html XDocReport
In this Java tutorial we learn how to convert a Word (.docx) file to an HTML file using the XDocReport library.
Table of contents
- Add XDocReport Converter DOCX XWPF Dependency to Java Project
- How to convert .docx file to .html file in Java
- How to Use FileConverter Class to convert Word to HTML File
Add XDocReport Converter DOCX XWPF Dependency to Java Project
If you build your project with Gradle, add the following dependency to the build.gradle file.
implementation group: 'fr.opensagres.xdocreport', name: 'fr.opensagres.xdocreport.converter.docx.xwpf', version: '2.0.3'
If you build your project with Maven, add the following dependency to the pom.xml file.
<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>fr.opensagres.xdocreport.converter.docx.xwpf</artifactId>
    <version>2.0.3</version>
</dependency>
How to convert .docx file to .html file in Java
In Java, given a Word file, we can use the XDocReport API with the following steps to convert it to an HTML file.
- Step 1: Open the .docx file as an InputStream using FileInputStream.
- Step 2: Create a new XWPFDocument object using the XWPFDocument(InputStream is) constructor.
- Step 3: Create a new instance of XHTMLOptions using the XHTMLOptions.create() static method.
- Step 4: Open the target .html file as an OutputStream using FileOutputStream.
- Step 5: Use the XHTMLConverter.getInstance().convert( XWPFDocument document, OutputStream out, T options ) method, passing the XHTMLOptions instance as options, to write the document to the output stream as HTML.
In the FileConverter Java class below, we implement the convertWordToHtml(String docxFileName, String htmlFileName) method, which converts the .docx file named by docxFileName to an .html file named by htmlFileName.
FileConverter.java
import fr.opensagres.poi.xwpf.converter.xhtml.XHTMLConverter;
import fr.opensagres.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.IOException;
import java.io.OutputStream;
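public class FileConverter {

    public void convertWordToHtml(String docxFileName, String htmlFileName) {
        // try-with-resources closes the streams and the document, even if conversion fails.
        // The document is opened before the output stream, so an unreadable .docx
        // does not leave an empty .html file behind.
        try (InputStream inputStream = new FileInputStream(docxFileName);
             XWPFDocument document = new XWPFDocument(inputStream);
             OutputStream outputStream = new FileOutputStream(htmlFileName)) {
            XHTMLOptions options = XHTMLOptions.create();
            // Convert .docx file to .html file
            XHTMLConverter.getInstance().convert(document, outputStream, options);
        } catch (FileNotFoundException e) {
            // The input file is missing or the output file cannot be created
            e.printStackTrace();
        } catch (IOException e) {
            // Reading the document or writing the HTML failed
            e.printStackTrace();
        }
    }
}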
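The convertWordToHtml method above only prints a stack trace when the input file cannot be opened, so the caller cannot tell that nothing was converted. One option is to validate the input before calling the converter. The following is a minimal sketch using only the Java standard library; the DocxInputCheck class and the requireReadableDocx method are hypothetical names introduced for illustration and are not part of XDocReport.

```java
import java.io.FileNotFoundException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper (not part of XDocReport): validates the input file name
// before it is handed to FileConverter.convertWordToHtml.
final class DocxInputCheck {

    private DocxInputCheck() {
    }

    // Returns the path when it names a readable, regular .docx file.
    // Throws IllegalArgumentException for a wrong extension and
    // FileNotFoundException when the file is missing or unreadable.
    static Path requireReadableDocx(String docxFileName) throws FileNotFoundException {
        Path path = Paths.get(docxFileName);
        // getFileName() is null for a root path such as "D:\"
        String name = path.getFileName() == null ? "" : path.getFileName().toString();
        if (!name.toLowerCase(Locale.ROOT).endsWith(".docx")) {
            throw new IllegalArgumentException("Not a .docx file: " + docxFileName);
        }
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            throw new FileNotFoundException("Cannot read " + docxFileName);
        }
        return path;
    }
}
```

A caller could invoke DocxInputCheck.requireReadableDocx(docxFileName) first and run the conversion only if no exception is thrown. The check looks at the file name and file attributes only; it does not verify that the content is a valid Word document.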
How to Use FileConverter Class to convert Word to HTML File
For example, we have a sample Word file located at D:\SimpleSolution\Data\Document.docx with the content shown in the screenshot below.
In the following example Java program, we use the FileConverter class from the previous step to convert the sample Word file above to an HTML file.
ConvertDocxToHtmlExample1.java
public class ConvertDocxToHtmlExample1 {
    public static void main(String... args) {
        String docxFileName = "D:\\SimpleSolution\\Data\\Document.docx";
        String htmlFileName = "D:\\SimpleSolution\\Data\\Document.html";

        FileConverter fileConverter = new FileConverter();
        fileConverter.convertWordToHtml(docxFileName, htmlFileName);
    }
}
When we run the Java application, the HTML file is generated at D:\SimpleSolution\Data\Document.html. Opening it in the browser gives the result shown in the screenshot below.
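The example program hard-codes both file names. When the HTML file should simply sit next to the Word file, the output name can be derived from the input name instead. The following is a minimal sketch using java.nio.file from the standard library; the HtmlFileNames class and the htmlPathFor method are hypothetical names introduced for illustration.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: derives the .html file name from the .docx file name.
final class HtmlFileNames {

    private HtmlFileNames() {
    }

    // Replaces the last extension of the file name with ".html" and keeps the
    // same parent directory. A name without an extension just gets ".html" appended.
    // Assumes docxPath has a file name component (it is not a root path).
    static Path htmlPathFor(Path docxPath) {
        String name = docxPath.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return docxPath.resolveSibling(base + ".html");
    }
}
```

With this helper, the call in the example program could be written as fileConverter.convertWordToHtml(docxPath.toString(), HtmlFileNames.htmlPathFor(docxPath).toString()), where docxPath is a Path built with Paths.get from the sample file location.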
Happy Coding 😊