Getting Some Text from a PDF Page

The PlugPDF SDK supports text extraction through two API methods: extractText and extractTextRects.

Getting sections of text from PDF pages is a piece of cake with the PlugPDF SDK. The selected text can be saved to a file, copied to the clipboard, attached to an email, or even pasted into a text message. Let's see this with an example that extracts the text inside the red square illustrated below.

[Screenshot: PDF page with the extraction area marked by a red square]

First of all, we must create a PDF Document instance and get the page size.

Next, the RectF area must be calculated in document coordinates, that is, from the page size and not from the screen size.
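To make the conversion concrete, here is a minimal sketch of how a rectangle drawn on screen could be mapped to document coordinates. It is not PlugPDF code: Area is a plain stand-in for android.graphics.RectF so the example runs without Android, and AreaConverter and toDocumentArea are hypothetical names. It also assumes the rendered page fills the view exactly and that both coordinate systems start at the top-left corner; if your layout adds margins, zoom, or scrolling, those offsets need to be removed first.

```java
/** Plain stand-in for android.graphics.RectF (assumption: keeps the sketch free of Android). */
final class Area {
    final float left, top, right, bottom;

    Area(float left, float top, float right, float bottom) {
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }
}

/** Hypothetical helper, not part of the PlugPDF SDK. */
final class AreaConverter {
    /**
     * Scales a rectangle from view (screen) pixels to document units.
     * Assumes the page is rendered edge to edge inside the view and that
     * both coordinate systems share a top-left origin.
     */
    static Area toDocumentArea(Area onScreen,
                               float viewWidth, float viewHeight,
                               float pageWidth, float pageHeight) {
        float scaleX = pageWidth / viewWidth;   // document units per screen pixel, horizontally
        float scaleY = pageHeight / viewHeight; // document units per screen pixel, vertically
        return new Area(onScreen.left * scaleX, onScreen.top * scaleY,
                        onScreen.right * scaleX, onScreen.bottom * scaleY);
    }
}
```

For instance, with a 1200 x 1600 view showing a 600 x 800 page, both scale factors are 0.5, so every screen coordinate is halved before the area is passed on.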

The extractText method requires both a page index and an area, and returns a String containing the text found in that area of the given page.

The difference between document.extractTextRects and extractText is the type of the result. The document.extractTextRects method returns an ArrayList containing all the rects; in this case, each rect represents one line of text.
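One thing you might do with a list of line rects is merge them into a single bounding box, for example to highlight the whole extracted block. The sketch below only illustrates that idea: LineRect is a stand-in for the rect type returned by the SDK, and LineRects.boundingBox is a hypothetical helper, not a PlugPDF method.

```java
import java.util.List;

/** Stand-in for one line rect (assumption: left/top/right/bottom floats, like RectF). */
final class LineRect {
    final float left, top, right, bottom;

    LineRect(float left, float top, float right, float bottom) {
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }
}

/** Hypothetical helper, not part of the PlugPDF SDK. */
final class LineRects {
    /** Returns the smallest rect enclosing every line, or null when the list is empty. */
    static LineRect boundingBox(List<LineRect> lines) {
        if (lines.isEmpty()) {
            return null;
        }
        LineRect first = lines.get(0);
        float left = first.left, top = first.top;
        float right = first.right, bottom = first.bottom;
        for (LineRect line : lines) {
            left = Math.min(left, line.left);
            top = Math.min(top, line.top);
            right = Math.max(right, line.right);
            bottom = Math.max(bottom, line.bottom);
        }
        return new LineRect(left, top, right, bottom);
    }
}
```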

Here come the pieces of this simple puzzle all put together.

The code above generates this result log.

If you want to extract all the text instead of a single section, change the area value so that it covers the whole page.
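As a rough sketch of what "changing the area value" could look like, the helper below builds an area spanning the entire page from the page size obtained earlier. PageArea stands in for RectF and fullPage is a hypothetical name; the sketch assumes the page origin is (0, 0) at the top-left corner, which you should confirm against the SDK documentation.

```java
/** Stand-in for android.graphics.RectF (assumption: keeps the sketch free of Android). */
final class PageArea {
    final float left, top, right, bottom;

    PageArea(float left, float top, float right, float bottom) {
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }

    /** Builds an area covering the whole page, assuming a (0, 0) top-left origin. */
    static PageArea fullPage(float pageWidth, float pageHeight) {
        return new PageArea(0f, 0f, pageWidth, pageHeight);
    }
}
```

Passing an area like this for each page index in turn would then cover the whole document.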

