Converting HTML files to RST via PyPandoc Python Library

Let's walk through how to convert an HTML file to an RST file with Python and the PyPandoc library.

An RST file is a document written in the reStructuredText markup language. reStructuredText applies basic styling and formatting to plain text, and it is used mostly for technical documentation of Python programs.

To convert documents into RST files, we will use the following two approaches:

  1. Using Pandoc

  2. Using Python code to convert HTML files to RST files

1. Using Pandoc:

Pandoc is a document conversion utility that converts files between many formats. In this article, we will convert an HTML file to an RST file.

First, we will convert a single HTML file to an RST file using the following steps:

  1. Open the terminal on your system.

  2. Navigate to the directory where the file you want to convert is stored.

  3. Run the following command in the terminal:

C:\Users\lenovo> pandoc filename -o newname

Here, filename is the name of the file you want to convert, and newname is the name you want to give the converted file. Pandoc infers the input and output formats from the file extensions.

For example:

filename = mytext.html (the file to convert)

newname = mytext.rst (we are converting from HTML to RST)

  4. Finally, check the directory: the HTML file has been converted to an RST file.

  • Now we will convert multiple HTML files to RST files using the pypandoc library.

To use the pypandoc library, we first need to install it. Note that pypandoc is a Python wrapper around Pandoc, so Pandoc itself must also be available on the system. We can use the following command to install pypandoc:

C:\Users\lenovo>pip install pypandoc

Now, the next step is to write the following in your Python code:
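A minimal sketch of such a script is shown here. It assumes the HTML files sit in a folder called `html_files` (a placeholder name) and that each output file should keep the same base name with an `.rst` extension. The helper names `rst_path` and `convert_folder` are illustrative and not part of pypandoc; the only pypandoc call used is `convert_file`.

```python
from pathlib import Path


def rst_path(html_path):
    """Return the output path for an HTML file: same name, .rst extension."""
    return Path(html_path).with_suffix(".rst")


def convert_folder(folder):
    """Convert every .html file in `folder` to .rst and return the new paths."""
    # Imported here so the path helper above works without pypandoc installed.
    import pypandoc

    converted = []
    for src in sorted(Path(folder).glob("*.html")):
        dst = rst_path(src)
        # When `outputfile` is given, convert_file writes the result to that file.
        pypandoc.convert_file(str(src), "rst", format="html", outputfile=str(dst))
        converted.append(dst)
    return converted


if __name__ == "__main__":
    # "html_files" is a placeholder folder name; change it to your own folder.
    folder = Path("html_files")
    if folder.is_dir():
        for path in convert_folder(folder):
            print("created", path)
    else:
        print("folder not found:", folder)
```

Each `.rst` file is written next to its source `.html` file, so the original files are left untouched.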

After executing the above code, all the HTML files in that folder will be converted to RST files.

In this way, Python makes it much easier to convert every file in a folder to another format. When you have many files to convert, this is simpler than running the terminal command once per file.

2. Using Python code to convert all the files to RST files:

This approach is similar to the previous one; the difference is that here the code does not use the pypandoc library.

Write the following code to convert the files to RST format.
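One way this could look, as a sketch rather than the exact original listing: the script below uses only the standard library and prints one `pandoc` command per HTML file in the current directory. The function names `pandoc_command` and `commands_for_folder` are hypothetical, and file names are wrapped in double quotes on the assumption that they contain no quote characters themselves.

```python
from pathlib import Path


def pandoc_command(html_path):
    """Build the pandoc command line that converts one HTML file to RST."""
    src = Path(html_path)
    dst = src.with_suffix(".rst")
    # Double quotes keep file names with spaces intact in the terminal.
    return f'pandoc "{src}" -o "{dst}"'


def commands_for_folder(folder="."):
    """Return one pandoc command for every .html file in `folder`."""
    return [pandoc_command(p) for p in sorted(Path(folder).glob("*.html"))]


if __name__ == "__main__":
    # Print the commands so they can be copied into a terminal.
    for command in commands_for_folder("."):
        print(command)
```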

As we can see in the output, we get the command to execute for each file.

Now we can copy the output and paste it into a terminal, where all the commands are executed. Every .html file then has a converted .rst counterpart.

For any further queries, or anything related to Python development, coding, blogging, or tech documentation, you can DM me on LinkedIn or Instagram (id=acanubhav94).

Special credits to my team members: Siddhid and Anshika