Filtering Résumés

This example shows how to use the PDF extension to read résumés in PDF format and check whether they mention the required skills.

Scenario

A company is hiring a new software developer. Résumés arrive in PDF format, and the company needs to check whether certain keywords are present in them. The flow saves the HR team time because they no longer have to read each résumé manually to see whether the candidate has the required skills.

How to Run

1. Open the file ‘Flow.xml’.

2. Click on Debug with JSON composer.

Detail view of the Debug with JSON composer option in the Designer main toolbar.

3. The Input JSON composer modal will open. From the mode dropdown at the bottom left of the modal, select Text Mode and enter the following JSON with the keyword list.

{
    "keywords":["java", "python", "matlab"]
}
Input JSON composer modal showing the Text mode option selected from the mode menu and the Keyword JSON entered

4. Click OK

5. The document will now be in debug mode and the Start node of the flow diagram will be highlighted. To step through the diagram nodes one at a time, click the Step In option in the Designer toolbar (or press F11 on the keyboard).

Detail view of the debug navigation options in the Designer toolbar. The Step In option is selected.

Once execution completes successfully, the keyword matching percentage appears in the Parameters window. In this example the value is 66.67%, because two of the three keywords were found.

Parameters view showing the defined parameters of the project and the matchingPercentage parameter highlighted. This is showing a value of 66.67%

Any keyword that has no match in the résumé is reported in the Notifications window.

Notifications view showing the keyword "python" as having no match in the sample resume.

Project Description

The flow was created to filter PDF résumés. At the start, the user provides a list of keywords to cross-check against the résumé. The flow counts how many of those keywords appear in the résumé and reports the percentage of matching keywords.

The flow diagram for the project.

Process Steps

1. Create a Generic Flow file.

New document modal showing the Business Logic document type menu with the Flow document type selected

2. Open the Variables Definitions modal from the Flow document toolbar and define the following variables:

| Name | Direction | Type | Description |
|------|-----------|------|-------------|
| keywords | In | expression | List of keywords to input |
| pdf | Local | | Store the PDF input file |
| keyword | Local | string | Store each keyword in the loop ‘Keywords to search in PDF’ |
| keywordRegex | Local | string | Regular expression for each keyword |
| keywordContainingPageNumber | Local | string | Once a keyword is found, store the numbers of the pages that contain it |
| numberOfMatchingKeywords | Local | int | Count the number of matching keywords (default ‘value’ should be set to 0) |
| matchingPercentage | Local | decimal | Percentage of the matching keywords |

Variable Definition modal showing the Variables added and the Keywords variable selected and showing the configuration of that variable

3. Navigate back to the Flow canvas and add a Start node from the document toolbox. This is the solid green circle with a light weight keyline.

Detail view of the Start node, a solid green circle with a light weight keyline.

4. Now add an Activity node to load the PDF file. The Activity node is a blue rectangle with rounded corners. To change the node label, double-click the node, or navigate to the Appearance tab in Properties and enter the label in the field labeled ‘Text’.

Once added, select the Activity node and under the settings tab in the Properties menu, find the Expression field and enter the following expression:

pdf = 'Resume.pdf'|toPdf({textExtractionMethod:'words'})

5. Add a Loop node to go through each keyword in the input list keywords.

List Source: keywords
Item Name: keyword
Initializer: Build Regex node
Finalizer: Count matching keywords node

Add the following nodes inside the loop.

6. Add an Activity node to build the regex for each word

keywordRegex= "(?i)\\b" + keyword
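
To get a feel for what this pattern matches, the same regular expression can be tried outside the flow. The sketch below uses Python's `re` module, which also accepts the inline `(?i)` flag and the `\b` word boundary. The helper name `build_keyword_regex` and the sample sentences are hypothetical, and the `re.escape` call is an addition that the flow expression does not make. Because the pattern has no trailing `\b`, a keyword also matches the beginning of a longer word.

```python
import re

def build_keyword_regex(keyword: str) -> str:
    """Mirror the flow expression: case-insensitive flag + word boundary + keyword."""
    # re.escape is an assumption added for this sketch so that a keyword such as
    # "c++" is matched literally; the flow concatenates the raw keyword.
    return "(?i)\\b" + re.escape(keyword)

pattern = re.compile(build_keyword_regex("java"))

print(bool(pattern.search("Experienced in Java and SQL")))  # True: case is ignored
print(bool(pattern.search("Built sites with JavaScript")))  # True: no trailing \b
print(bool(pattern.search("Wrote ajava tooling")))          # False: \b needs a word start
```

If partial matches such as "JavaScript" for "java" are unwanted, appending a second `\b` after the keyword is one possible refinement.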

7. Add an Activity node to cross-check the words with the PDF file and get the indexes of the pages

keywordContainingPageNumber=pdf|pdfIndexOf(keywordRegex, 0)
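
The idea behind this step can be sketched in Python under some loud assumptions: the extracted PDF is modeled as a plain list of page texts, the function `pages_containing` is a hypothetical stand-in rather than the actual `pdfIndexOf` implementation, and the meaning of the second argument (`0`) is not stated in this article, so it is not modeled here.

```python
import re
from typing import List

def pages_containing(pages: List[str], keyword_regex: str) -> List[int]:
    """Return the 0-based indexes of the pages whose text matches the regex."""
    pattern = re.compile(keyword_regex)
    return [index for index, text in enumerate(pages) if pattern.search(text)]

# Hypothetical page texts standing in for the text extracted from a résumé
pages = [
    "Skills: Java, SQL, Git",
    "Projects: signal processing in MATLAB; a Java web service",
]

print(pages_containing(pages, "(?i)\\bjava"))    # [0, 1]
print(pages_containing(pages, "(?i)\\bpython"))  # []
```

An empty result corresponds to the "not found" branch of the Decision node in the next step.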

8. Add a Decision node. If the keyword was found, proceed. If not, show a notification using a Notification node.

Found Condition: keywordContainingPageNumber|length()>0
Notification Message: No match found

9. Add an Activity node to count matching keywords

numberOfMatchingKeywords+=1

Add the following nodes outside the loop.

10. Add a Decision node. If at least one keyword was found, proceed. If not, show a notification.

Yes Condition: numberOfMatchingKeywords>0
Notification Message: No matching keywords

11. Add an Activity node to determine the percentage of matching keywords

matchingPercentage = ((numberOfMatchingKeywords|asDecimal()/keywords|length())*100)|round(2)
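
The arithmetic can be checked with a short Python sketch. The function name `matching_percentage` is hypothetical, and the guard for an empty keyword list is an addition that the flow expression does not contain. With two of three keywords matched, the result is the 66.67% shown in the How to Run section.

```python
def matching_percentage(matched: int, total: int) -> float:
    """Percentage of matched keywords, rounded to two decimal places."""
    if total == 0:
        return 0.0  # assumption: avoid dividing by zero for an empty keyword list
    # Python's / is true division, so no explicit decimal conversion is needed here
    return round(matched / total * 100, 2)

print(matching_percentage(2, 3))  # 66.67
print(matching_percentage(3, 3))  # 100.0
```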

12. Add an End node to end the flow.
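
As a recap, the control flow of steps 5 to 12 can be approximated in Python. This is a minimal sketch under stated assumptions, not the product's implementation: the résumé is modeled as a list of page texts, `filter_resume` is a hypothetical name, the notification text includes the keyword for readability, and returning `None` when nothing matches is a choice made for this sketch.

```python
import re
from typing import List, Optional, Tuple

def filter_resume(pages: List[str], keywords: List[str]) -> Tuple[Optional[float], List[str]]:
    """Return (matching percentage or None, list of notification messages)."""
    notifications: List[str] = []
    matched = 0

    # Loop node: one pass per keyword
    for keyword in keywords:
        regex = "(?i)\\b" + re.escape(keyword)  # re.escape is an added safeguard
        found_pages = [i for i, text in enumerate(pages) if re.search(regex, text)]
        if found_pages:          # Decision node: keyword found
            matched += 1         # Activity node: count matching keywords
        else:                    # Notification node
            notifications.append(f"No match found: {keyword}")

    # Decision node outside the loop: at least one keyword found?
    if matched == 0:
        notifications.append("No matching keywords")
        return None, notifications

    return round(matched / len(keywords) * 100, 2), notifications

# Hypothetical page texts standing in for Resume.pdf
pages = ["Skills: Java, SQL, Git", "Projects: signal processing in MATLAB"]
print(filter_resume(pages, ["java", "python", "matlab"]))
# (66.67, ['No match found: python'])
```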

Download the project

The project can be downloaded using the attachment at the end of the page.

Updated on February 9, 2024

Article Attachments
