Manual data entry is a silent productivity killer. Every day, thousands of businesses lose hundreds of collective hours to a tedious, error-prone chore: opening a PDF document, highlighting text, hitting Ctrl+C, opening a spreadsheet, and pressing Ctrl+V. If your organization handles invoices, purchase orders, shipping manifests, or financial statements, you have likely asked yourself if there is a better way.
Fortunately, there is. With Microsoft Power Automate, you can automate the journey from a static PDF document to a structured Excel sheet. This step-by-step guide shows how to build a Power Automate PDF-to-Excel workflow in two ways: with AI Builder in the cloud, or with a free desktop robotic process automation (RPA) flow. It also covers the reverse path, converting Excel to PDF with Power Automate.
Why Automate PDF to Excel Workflows?
PDFs are designed for consistent display across different devices, not for data manipulation. Excel spreadsheets, on the other hand, are the lifeblood of business analysis, reporting, and operations. When you build an automated path between them, you unlock several major operational advantages:
- Fewer Copy-Paste Errors: Humans make typos, skip lines, and misread decimal points. An automation applies the same extraction rules to every document, so the mistakes that remain are systematic and easier to catch.
- Drastic Time Savings: A flow can extract, validate, and write data from dozens of PDFs to an Excel file in seconds—a task that would take a human clerk hours.
- Operational Scalability: Whether you receive 5 invoices a day or 5,000, your automated pipeline scales effortlessly without requiring additional headcount.
- Real-Time Visibility: By automating data extraction, dashboards and financial ledgers update the instant a document is received, rather than waiting for weekly manual processing runs.
Let’s dive into the two primary approaches to building a PDF-to-Excel pipeline in Power Automate: Cloud Flows utilizing AI Builder, and Desktop Flows using free RPA actions.
Method 1: The Cloud-First Approach (AI Builder & Document Processing)
If you receive PDF files digitally via email, SharePoint, or OneDrive, Microsoft’s Cloud Flows combined with AI Builder (specifically the Document Processing model) offer the most robust enterprise-grade solution. This method uses OCR (Optical Character Recognition) and machine learning to understand the context of your PDFs.
Prerequisites
- A Microsoft Power Automate premium license or an active trial.
- Access to AI Builder credits within your environment.
- A formatted Excel file saved in OneDrive or SharePoint containing an official Excel Table (use Ctrl+T to convert raw columns into a table).
Step 1: Create and Train Your AI Model
Before building the flow, you must teach Power Automate how to read your specific PDF layout.
- Log in to the Power Automate Maker Portal.
- In the left navigation bar, expand AI Hub and click on AI models.
- Select Document processing (formerly Form Processing) and choose the Structured documents model type.
- Define the exact fields you want to extract. For example:
  - Single fields: `InvoiceNumber`, `InvoiceDate`, `VendorName`, `TotalDue`
  - Tables: Create a table field (e.g., `LineItems`) and define columns like `Description`, `Quantity`, and `Amount`.
- Upload at least 5 sample PDF documents that share the same visual structure.
- Draw boundaries around the text in your sample PDFs, mapping the highlighted zones to the fields you created in step 4.
- Click Train, wait a few minutes for the system to process, and then click Publish to make your model active.
Step 2: Build the Power Automate Cloud Flow
Now, construct the automated flow to pass incoming PDFs directly to your newly trained model and write the extracted data to Excel.
- In the Maker Portal, click Create > Automated cloud flow.
- Choose a trigger. For instance, When a new file is created in a folder (SharePoint) or When a new email arrives (Office 365 Outlook) containing attachments.
- Add a new action: Get file content (if using SharePoint) to fetch the raw PDF data.
- Add the AI Builder action: Extract information from documents.
- Select your trained Document Processing model from the dropdown list. Set the Document type to `application/pdf` and pass the File Content output from the previous step into the Document field.
- Add the Excel Online (Business) action: Add a row into a table.
- Point the action to your Excel file located in SharePoint or OneDrive. Select the specific Table within that spreadsheet.
- Map your fields: Click into the inputs for your Excel columns, and select the corresponding dynamic outputs from the AI Builder extraction step (e.g., map `InvoiceNumber value` to the Excel Invoice Number column).
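The mapping step can be pictured as a lookup from model field names to Excel column headers. The sketch below uses Python purely for illustration. The shape of `extracted`, the confidence values, and the column names are assumptions made for this example, not the actual AI Builder output schema.

```python
# Hypothetical mapping from model field names to Excel column headers.
FIELD_TO_COLUMN = {
    "InvoiceNumber": "Invoice Number",
    "InvoiceDate": "Invoice Date",
    "VendorName": "Vendor",
    "TotalDue": "Total Due",
}


def build_row(extracted: dict) -> dict:
    """Return one Excel row keyed by column header.

    Fields the model did not return are written as empty strings so a
    missing value never shifts data into the wrong column.
    """
    row = {}
    for field, column in FIELD_TO_COLUMN.items():
        row[column] = extracted.get(field, {}).get("value", "")
    return row


# Assumed shape: each extracted field carries a value and a confidence score.
sample = {
    "InvoiceNumber": {"value": "12345", "confidence": 0.98},
    "VendorName": {"value": "Contoso", "confidence": 0.91},
}
print(build_row(sample))
```

Keeping the mapping in one place makes it easier to spot a field that was renamed in the model but not in the flow.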
Step 3: Handling Tabular/Line-Item Data
If your PDF contains a table of line items, simply using "Add a row into a table" once won't work because there are multiple lines of data.
- Add an Apply to each control block after the AI Builder action.
- In the input field of the Apply to each action, select the dynamic array output representing your extracted table (e.g., `LineItems`).
- Inside the loop, add the Add a row into a table action.
- Map the columns of your Excel table to the individual item fields of your loop array (e.g., `items('Apply_to_each')?['Description']`). This ensures that if an invoice has ten items, ten distinct rows are appended to your Excel sheet sequentially.
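The effect of the loop is to flatten one invoice into one row per line item. Here is a minimal Python sketch of that idea, assuming a simplified invoice dictionary whose keys mirror the example field names above. The real extraction output will be shaped differently.

```python
def rows_for_invoice(invoice: dict) -> list:
    """Emit one row per line item, repeating the invoice number on each."""
    rows = []
    for item in invoice.get("LineItems", []):
        rows.append({
            "Invoice Number": invoice["InvoiceNumber"],
            "Description": item.get("Description", ""),
            "Quantity": item.get("Quantity", ""),
            "Amount": item.get("Amount", ""),
        })
    return rows


# Assumed, simplified input for illustration only.
invoice = {
    "InvoiceNumber": "12345",
    "LineItems": [
        {"Description": "Widget", "Quantity": 2, "Amount": 10.0},
        {"Description": "Gadget", "Quantity": 1, "Amount": 5.5},
    ],
}
for row in rows_for_invoice(invoice):
    print(row)
```

Repeating the invoice number on every row keeps each line item traceable to its source document once the rows sit side by side in Excel.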
Note on Licensing: While highly efficient, this cloud-based PDF-to-Excel process requires AI Builder capacity, which may carry extra costs for high-volume environments.
Method 2: The Free Desktop Approach (Power Automate Desktop)
If you are on a budget, do not have a premium Power Automate license, or are dealing with PDFs stored locally on your physical machine, Power Automate Desktop (PAD) is your best choice. PAD is completely free with Windows 10 and 11 and lets you build robotic process automations locally.
Prerequisites
- Power Automate Desktop installed on your PC.
- A local directory containing the target PDF files.
- A local Excel file to write the results into.
Step 1: Initialize Variables and Files
- Launch Power Automate Desktop and click New Flow. Name it "Extract PDF to Excel".
- Add the Get files in folder action. Set the folder path to your directory of local PDFs. This creates a list variable called `%Files%` containing all file paths.
- Add the Launch Excel action. Select "and open the following document" and specify the path to your target Excel spreadsheet. Save the produced variable as `%ExcelInstance%`.
- Use the Set variable action to initialize a numeric variable named `%RowCounter%` with a value of `2` (assuming row 1 contains your header labels).
Step 2: Loop and Extract Text
Now we will iterate through each individual document in the folder to extract and parse the text content.
- Add a For each loop. Set the value to iterate over to `%Files%` and store the current item in `%CurrentFile%`.
- Inside the loop, drag in the Extract text from PDF action. Set the PDF file to `%CurrentFile.FullName%`. Set the extraction mode to "All" and output the raw text to a variable named `%ExtractedText%`.
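For readers who think in code, the folder loop can be sketched in Python as follows. This is an analogy, not what Power Automate Desktop runs internally. `extract_text` is a hypothetical callable standing in for the Extract text from PDF action.

```python
from pathlib import Path


def list_pdfs(folder: str) -> list:
    """Return the PDF files in `folder`, sorted so every run is repeatable."""
    return sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() == ".pdf"
    )


def process_folder(folder: str, extract_text) -> dict:
    """Call `extract_text` once per PDF and collect the results by file name."""
    return {path.name: extract_text(path) for path in list_pdfs(folder)}
```

Filtering on the extension matters in practice: a stray text file or temporary Excel lock file in the folder would otherwise be handed to the PDF extraction step.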
Step 3: Parse Data Using Regular Expressions (RegEx)
Raw text extracted from a PDF looks like one long, chaotic string. To pull out specific values like an invoice number, we use Regular Expressions (RegEx).
- Add the Parse text action inside the loop.
- Set Text to parse to `%ExtractedText%`.
- Under Is Regular Expression, toggle it to True. Provide a matching pattern. For example, if your PDFs always display the invoice number as `Invoice #: 12345`, use the RegEx pattern `(?<=Invoice\s#:\s*)(\d+)` to target the numbers trailing that exact phrase.
- Save the match to a variable (e.g., `%InvoiceNumberMatches%`).
- Repeat this parsing process for other values you need to capture, such as dates or total balances.
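It helps to test a pattern against sample text before wiring it into the flow. The Python sketch below does that for the invoice-number example. One difference is worth noting: the pattern in the steps above relies on a variable-width look-behind, which .NET-style regex engines accept but Python's `re` module rejects, so this version uses a capture group instead. The sample text is invented for illustration.

```python
import re

# Capture group instead of look-behind: Python's `re` does not allow a
# variable-width look-behind such as (?<=Invoice\s#:\s*).
INVOICE_RE = re.compile(r"Invoice\s#:\s*(\d+)")


def find_invoice_numbers(text: str) -> list:
    """Return every number that follows 'Invoice #:' in the text."""
    return INVOICE_RE.findall(text)


# Invented sample resembling raw text extracted from a PDF.
sample_text = "ACME Corp\nInvoice #: 12345\nDate: 2024-01-31\nTotal Due: $1,250.00"
print(find_invoice_numbers(sample_text))
```

An empty list is the signal that a document did not match the expected layout, which is a useful condition to check before writing anything to Excel.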
Step 4: Write Captured Data to Excel
- Inside the loop, add a Write to Excel worksheet action.
- Set the Excel instance to `%ExcelInstance%`.
- Under Value to write, pass the first item of your RegEx match list: `%InvoiceNumberMatches[0]%`. The index only works when Parse text returns a list, which means turning off its First occurrence only option.
- Set Write mode to "On specified cell". Set the Column to `A` and the Row to `%RowCounter%`.
- Repeat this step for additional parsed fields, writing them to Columns B, C, D, etc., on the same `%RowCounter%`.
- At the end of the loop, drag in the Increase variable action. Increase `%RowCounter%` by `1` so the next PDF's data writes to the subsequent line instead of overwriting the previous run.
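The row-counter logic is easy to get wrong, so here is a small Python sketch of it. The `sheet` dictionary is a stand-in for the worksheet, invented for this example. It is not an Excel API.

```python
def write_rows(sheet: dict, records: list, start_row: int = 2) -> int:
    """Write each record across columns A, B, C... and return the next free row.

    `sheet` stands in for the worksheet: keys are (column, row) tuples.
    """
    row_counter = start_row  # row 1 is assumed to hold the headers
    for record in records:
        for offset, value in enumerate(record):
            column = chr(ord("A") + offset)  # sufficient for up to 26 columns
            sheet[(column, row_counter)] = value
        row_counter += 1  # without this, every PDF overwrites the same row
    return row_counter


sheet = {}
next_row = write_rows(sheet, [("12345", "2024-01-31"), ("12346", "2024-02-01")])
print(next_row, sheet)
```

Incrementing once per document, after all of its columns are written, is what keeps the fields from one PDF on a single row.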
Step 5: Save and Close
- Outside of the For Each loop, add the Close Excel action.
- Select "Save document before closing Excel".
This desktop approach is robust, fast, and completely free. While parsing with RegEx requires trial and error, it provides extreme flexibility for structured text-based PDFs.
Method 3: Reverse the Flow (Converting Excel to PDF with Power Automate)
Many businesses require a dual-direction workflow. Once you have compiled data in Excel, you may need to generate a read-only PDF report to email to customers or archive in SharePoint. That calls for an Excel-to-PDF flow in Power Automate.
While there are premium tools to do this, there is an incredibly popular, completely free "hack" using standard OneDrive for Business actions in Cloud Flows.
The Free OneDrive File-Conversion Trick
- Trigger: Choose your starting point, such as When a file is created or modified (properties only) on SharePoint, or a manual button trigger.
- Get File Content: Add a SharePoint Get file content action to read the source `.xlsx` spreadsheet.
- Create Temp File in OneDrive: Add the OneDrive for Business action Create file. Choose a temporary directory, name the file dynamically (e.g., `Report_@{utcNow('yyyy-MM-dd')}.xlsx`), and map the File Content from step 2.
- Convert File: Add the OneDrive for Business action Convert file. Set the File to the unique Id returned by the OneDrive Create file step. Change the Target type to PDF from the dropdown menu.
- Save PDF Back to SharePoint: Add a SharePoint Create file action. Choose your destination document library, name the file with a `.pdf` extension, and map the File Content output directly from the OneDrive Convert file action.
- Clean Up OneDrive: To maintain tidy directories and prevent data leaks, add a OneDrive Delete file action at the very end of your flow to remove the temporary Excel file created in OneDrive. Convert file returns the PDF as content rather than saving it, so there is normally no temporary PDF to delete.
This simple workaround lets you convert Excel to PDF in Power Automate without paying for premium connectors like Adobe or Muhimbi.
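One detail worth checking in the OneDrive steps is the temporary file name. A date-only name such as the `utcNow('yyyy-MM-dd')` example repeats if the flow runs more than once in a day. The Python sketch below mirrors that naming scheme. The optional `run_id` suffix is an assumption introduced here to keep names unique, not something the steps above prescribe.

```python
from datetime import datetime, timezone


def temp_report_name(now: datetime, run_id: str = "") -> str:
    """Build the temporary workbook name used before conversion.

    Mirrors utcNow('yyyy-MM-dd'); the optional run_id suffix is an
    assumption added to keep names unique within a single day.
    """
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%d")
    suffix = f"_{run_id}" if run_id else ""
    return f"Report_{stamp}{suffix}.xlsx"


def pdf_name(xlsx_name: str) -> str:
    """Swap the .xlsx extension for .pdf for the converted output."""
    stem = xlsx_name.rsplit(".", 1)[0]
    return f"{stem}.pdf"


now = datetime(2024, 3, 5, 23, 30, tzinfo=timezone.utc)
print(temp_report_name(now), pdf_name(temp_report_name(now)))
```

In a real flow, a finer-grained timestamp format or a unique identifier in the file name would serve the same purpose as `run_id`.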
Pro Tips for Flawless PDF-Excel Workflows
Building your first workflow is easy, but making it highly resilient to real-world edge cases requires fine-tuning. Implement these professional best practices to ensure your flows run smoothly:
1. Account for Scanned PDFs (OCR Challenges)
Standard text-extraction actions in Power Automate assume the PDF is digital (generated directly from software like Word or Excel). If someone prints a document, writes on it, and scans it, simple text extraction will fail.
- Cloud Flows: AI Builder handles scans natively with advanced cloud-based OCR.
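- Desktop Flows: If using PAD, "Extract text from PDF" will return little or nothing for a scan. Use the Extract text with OCR action instead, which lets you choose between the Windows OCR engine and Tesseract. That action reads images rather than PDFs, so you may first need to pull the page images out with Extract images from PDF.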
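A flow can decide between plain extraction and OCR automatically by checking how much text the first attempt returned. The Python sketch below shows one such heuristic. The 20-character threshold is an arbitrary assumption and should be tuned against your own documents.

```python
def needs_ocr(extracted_text: str, min_chars: int = 20) -> bool:
    """Guess whether a PDF is a scan: plain extraction returned almost nothing."""
    visible = "".join(extracted_text.split())  # drop all whitespace
    return len(visible) < min_chars


print(needs_ocr(""))
print(needs_ocr("Invoice #: 12345 Total Due: $1,250.00"))
```

In a desktop flow, the same check would be an If action on the length of the extracted text, with the OCR path in the branch that handles short or empty results.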
2. Handle Document Variations with Classification Models
If your intake directory receives multiple document types (e.g., some are utility bills, some are receipts, some are bank statements), passing them all into a single Document Processing model will result in massive errors.
- Use AI Builder Document Classification to first analyze incoming files.
- Set up a conditional branch (Switch control) in Power Automate. If the classification is "Invoice," route it to Model A; if it is "Bank Statement," route it to Model B.
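The routing logic amounts to a lookup with a fallback. A minimal Python sketch follows, with hypothetical model identifiers. A Switch control with a default case expresses the same thing in a cloud flow.

```python
# Hypothetical model identifiers; substitute the names of your own models.
MODEL_BY_DOC_TYPE = {
    "Invoice": "model-a-invoices",
    "Bank Statement": "model-b-statements",
}


def route(doc_type: str) -> str:
    """Return the model for a classified document, or flag it for review."""
    return MODEL_BY_DOC_TYPE.get(doc_type, "manual-review")


print(route("Invoice"))
print(route("Utility Bill"))
```

The fallback matters: an unrecognized document type should go to a person rather than be forced through a model trained on a different layout.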
3. Always Implement Error Handling
Automations can break if an Excel file is locked by another user, or if a PDF is password-protected.
- In Cloud Flows, add a notification action (for example, an email to an administrator) after your Excel action, open that new action's menu (the three dots in the classic designer), and choose Configure run after. Set it to run only when the Excel action has failed.
- In Desktop Flows, use On error blocks to gracefully capture failures, close open background instances of Excel, and write a status code to a system log instead of letting the entire local flow crash.
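The pattern behind both tips is the same: isolate each file so that one failure is logged and the batch continues. Here is a Python sketch of that pattern. `extract` and `write` are hypothetical callables standing in for the extraction and Excel steps.

```python
def process_file(path: str, extract, write, log: list) -> bool:
    """Run extract then write for one file, logging a failure instead of crashing."""
    try:
        data = extract(path)
        write(data)
    except Exception as exc:  # broad on purpose: one bad file must not stop the batch
        log.append(f"FAILED {path}: {exc}")
        return False
    log.append(f"OK {path}")
    return True


def locked_file(path):
    """Simulates a workbook locked by another user."""
    raise PermissionError("file is locked")


log = []
process_file("a.pdf", locked_file, lambda data: None, log)
process_file("b.pdf", lambda path: {"InvoiceNumber": "12345"}, lambda data: None, log)
print(log)
```

The log then doubles as the status report to send to an administrator at the end of the run.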
Frequently Asked Questions
Can I extract data from a PDF to Excel using Power Automate without a Premium license?
Yes! By using Power Automate Desktop (PAD), which is free with Windows 10/11, you can extract text from local PDFs and write it into Excel without any Premium licensing or subscription fees. Premium licensing typically comes into play when you trigger or schedule desktop flows from the cloud, use premium connectors, or use AI Builder.
Why does my OneDrive Excel-to-PDF conversion generate blank or poorly formatted pages?
This is almost always due to the default print settings in the source Excel file. Power Automate converts the Excel file based on the spreadsheet's active print configurations. To fix this, open your Excel template, go to the Page Layout tab, configure your Print Area, set scaling to Fit All Columns on One Page, and save the template before running the conversion flow.
How many sample PDFs do I need to train the AI Builder Document Processing model?
Microsoft requires an absolute minimum of 5 sample documents of the exact same layout. However, to achieve enterprise-level extraction accuracy (over 95%), it is highly recommended to upload and tag between 15 and 20 samples representing varied data inputs.
Can Power Automate handle multi-page PDFs with tables that span across pages?
Yes. AI Builder is highly intelligent and can extract multi-page tables seamlessly, grouping them into a unified dataset. In Power Automate Desktop, you may need to use advanced RegEx loops or text-splitting logic to continually read and append lines until a "Page End" indicator is found.
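For the desktop route, one way to stitch a table that breaks across pages is to keep only the lines that look like table rows and ignore everything else. The Python sketch below assumes a simple row layout (description, whole-number quantity, amount with two decimals) and invented page text. Real documents will need their own pattern.

```python
import re

# Assumed row layout: description, integer quantity, amount with two decimals.
ROW_RE = re.compile(r"^(?P<desc>.+?)\s+(?P<qty>\d+)\s+(?P<amount>\d+\.\d{2})$")


def stitch_table(pages: list) -> list:
    """Collect line-item rows from every page into one list.

    Lines that do not look like a row (repeated headers, page footers)
    are skipped, so a table that breaks across pages comes out whole.
    """
    rows = []
    for page_text in pages:
        for line in page_text.splitlines():
            match = ROW_RE.match(line.strip())
            if match:
                rows.append((match["desc"], int(match["qty"]), float(match["amount"])))
    return rows


# Invented page text for illustration.
pages = [
    "Description Qty Amount\nWidget 2 10.00\nGadget 1 5.50\nPage 1 of 2",
    "Description Qty Amount\nCable 10 3.25\nPage 2 of 2",
]
print(stitch_table(pages))
```

Matching rows by shape avoids depending on a specific page-end marker, which often differs between document templates.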
How secure is my data when converting PDFs using Power Automate?
Power Automate operates within the highly secure Microsoft Power Platform ecosystem, which complies with GDPR, HIPAA, and major global data-security standards. Data processed in cloud flows remains encrypted in transit and at rest. If your company handles highly sensitive records, you can keep the data entirely on-premises by running local Desktop flows.
Conclusion
Automating the tedious cycle of shifting data between PDFs and Excel sheets is one of the fastest ways to inject efficiency into your business operations. Choose the Power Automate PDF-to-Excel path that fits your budget and infrastructure, whether that is the AI-driven extraction of AI Builder in the cloud or the customizable, free Power Automate Desktop, and you eliminate manual friction and free up your team to focus on high-value analysis.
Start small: build a simple 5-step desktop flow or run a trial of AI Builder with a few sample documents. Once you experience the magic of seeing physical records transform into structured, ready-to-analyze Excel rows automatically, you will never go back to manual copy-pasting again.








