How to convert a Word file to an Excel file by extracting table data from a Word document and adding it to a worksheet in .NET?
Syncfusion® Essential® DocIO is a .NET Word library used to create, read, edit, and convert Word documents programmatically without Microsoft Word or interop dependencies. Using this library together with the Syncfusion® XlsIO library, you can convert a Word file to an Excel file by extracting table data from the Word document and adding it to a worksheet in .NET using C#.
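Steps to convert a Word file to an Excel file by extracting table data from a Word document and adding it to a worksheet in .NET:
- Create a new .NET Core console application project.
- Install the Syncfusion.DocIO.Net.Core NuGet package as a reference to your project from NuGet.org. The code in this article also uses the Syncfusion.XlsIO namespace, which is provided by the Syncfusion.XlsIO.Net.Core NuGet package, so install that package as well.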
Starting with v16.2.0.x, if you reference Syncfusion® assemblies from a trial setup or from the NuGet feed, you must include a license key in your projects. Refer to the link to learn about generating and registering a Syncfusion® license key in your application to use the components without a trial message.
- Include the following namespaces in the Program.cs file.
C#
using System.Collections.Generic;
using System.IO;
using System.Text;
using Syncfusion.DocIO;
using Syncfusion.DocIO.DLS;
using Syncfusion.XlsIO;
- Use the following code example to convert a Word file to an Excel file by extracting table data from a Word document and adding it to a worksheet in .NET.
C#
// Open an existing Word document
using (FileStream inputfileStream = new FileStream(Path.GetFullPath(@"Data/Input.docx"), FileMode.Open))
{
    using (WordDocument document = new WordDocument(inputfileStream, FormatType.Automatic))
    {
        using (ExcelEngine engine = new ExcelEngine())
        {
            IApplication app = engine.Excel;
            app.DefaultVersion = ExcelVersion.Excel2016;
            // Create one sheet to start with; more sheets are added as more tables are found.
            IWorkbook workbook = app.Workbooks.Create(1);
            int sheetIndex = 0;
            int tableNumber = 0;
            // Get all table entities in the Word document
            List<Entity> entities = document.FindAllItemsByProperty(EntityType.Table, null, null);
            foreach (Entity entity in entities)
            {
                WTable wTable = (WTable)entity;
                // Add a new worksheet when every existing one is already used by a table
                if (sheetIndex >= workbook.Worksheets.Count)
                    workbook.Worksheets.Create();
                IWorksheet worksheet = workbook.Worksheets[sheetIndex++];
                worksheet.Name = $"Table{++tableNumber}";
                // Export the table, preserving merged cells
                ExportWordTableToExcelMerged(wTable, worksheet);
                // Fit row heights and column widths to the exported content
                worksheet.UsedRange.AutofitRows();
                worksheet.UsedRange.AutofitColumns();
            }
            // Save the workbook as an Excel file
            using (FileStream outputStream = new FileStream(Path.GetFullPath(@"Output/Result.xlsx"), FileMode.Create))
            {
                workbook.SaveAs(outputStream);
            }
            workbook.Close();
        }
    }
}
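The conversion code saves the workbook to Output/Result.xlsx with FileMode.Create, which throws if the Output folder does not exist yet. The following is a minimal sketch of one way to guard against that, assuming you want the folder created on demand. The helper name OpenOutputStream and the temporary path used in the usage lines are hypothetical and not part of the sample.

```csharp
using System;
using System.IO;

// Usage: write to a folder that may not exist yet (a temp folder is used here for illustration).
string outputPath = Path.Combine(Path.GetTempPath(), "WordToExcelSketch", "Result.xlsx");
using (FileStream outputStream = OpenOutputStream(outputPath))
{
    // In the conversion code, workbook.SaveAs(outputStream) would be called here.
    Console.WriteLine($"Opened {outputStream.Name} for writing.");
}

// Hypothetical helper: creates the parent folder if it is missing, then opens the file for writing.
static FileStream OpenOutputStream(string path)
{
    string fullPath = Path.GetFullPath(path);
    var directory = Path.GetDirectoryName(fullPath);
    if (!string.IsNullOrEmpty(directory))
        Directory.CreateDirectory(directory); // does nothing when the folder already exists
    return new FileStream(fullPath, FileMode.Create, FileAccess.Write);
}
```

With a helper like this, the output stream in the conversion code could be opened through OpenOutputStream(@"Output/Result.xlsx") instead of constructing the FileStream directly.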
The following code example provides the supporting methods (ExportWordTableToExcelMerged, GetCellAtGridColumn, and BuildCellText) used by the conversion code.
C#
static void ExportWordTableToExcelMerged(IWTable table, IWorksheet worksheet)
{
    for (int r = 0; r < table.Rows.Count; r++)
    {
        WTableRow wRow = (WTableRow)table.Rows[r];
        // Map Word's logical grid to Excel columns using GridSpan
        int gridCol = 1;
        for (int i = 0; i < wRow.Cells.Count; i++)
        {
            WTableCell wCell = wRow.Cells[i];
            // Horizontal width in grid columns
            int hSpan = (int)wCell.GridSpan;
            // Merge flags
            CellMerge vFlag = wCell.CellFormat.VerticalMerge; // None | Start | Continue
            CellMerge hFlag = wCell.CellFormat.HorizontalMerge;
            // Excel start cell for this Word cell
            int excelStartRowIndex = r + 1;
            int excelStartColIndex = gridCol;
            // Compute vertical span when this cell is the START of a vertical merge
            int vSpan = 1;
            if (vFlag == CellMerge.Start)
            {
                // Count how many subsequent rows continue the merge at the same grid column
                for (int nr = r + 1; nr < table.Rows.Count; nr++)
                {
                    WTableRow nextRow = (WTableRow)table.Rows[nr];
                    WTableCell nextCell = GetCellAtGridColumn(nextRow, excelStartColIndex); // 1-based grid col
                    if (nextCell != null && nextCell.CellFormat.VerticalMerge == CellMerge.Continue)
                        vSpan++;
                    else
                        break;
                }
            }
            // Extend the horizontal span when this cell is the START of a horizontal merge
            if (hFlag == CellMerge.Start)
            {
                for (int nc = i + 1; nc < wRow.Cells.Count; nc++)
                {
                    WTableCell cell = wRow.Cells[nc];
                    if (cell != null && cell.CellFormat.HorizontalMerge == CellMerge.Continue)
                        hSpan += cell.GridSpan;
                    else
                        break;
                }
            }
            // Is Start or None of a merge region
            bool isNotContinuedCell =
                (vFlag != CellMerge.Continue) &&
                (hFlag != CellMerge.Continue);
            if (isNotContinuedCell)
            {
                int vMergeEndIndex = excelStartRowIndex + vSpan - 1;
                int hMergeEndColIndex = excelStartColIndex + hSpan - 1;
                // Merge in Excel if region spans multiple cells
                if (vMergeEndIndex > excelStartRowIndex || hMergeEndColIndex > excelStartColIndex)
                    worksheet.Range[excelStartRowIndex, excelStartColIndex, vMergeEndIndex, hMergeEndColIndex].Merge();
                // Write the visible text to the top-left Excel cell
                IRange range = worksheet.Range[excelStartRowIndex, excelStartColIndex];
                range.Text = BuildCellText(wCell);
                // Center the text and draw a thin border around the whole region
                range.CellStyle.HorizontalAlignment = ExcelHAlign.HAlignCenter;
                range.CellStyle.VerticalAlignment = ExcelVAlign.VAlignCenter;
                worksheet.Range[excelStartRowIndex, excelStartColIndex, vMergeEndIndex, hMergeEndColIndex].CellStyle.Borders[ExcelBordersIndex.EdgeLeft].LineStyle = ExcelLineStyle.Thin;
                worksheet.Range[excelStartRowIndex, excelStartColIndex, vMergeEndIndex, hMergeEndColIndex].CellStyle.Borders[ExcelBordersIndex.EdgeRight].LineStyle = ExcelLineStyle.Thin;
                worksheet.Range[excelStartRowIndex, excelStartColIndex, vMergeEndIndex, hMergeEndColIndex].CellStyle.Borders[ExcelBordersIndex.EdgeTop].LineStyle = ExcelLineStyle.Thin;
                worksheet.Range[excelStartRowIndex, excelStartColIndex, vMergeEndIndex, hMergeEndColIndex].CellStyle.Borders[ExcelBordersIndex.EdgeBottom].LineStyle = ExcelLineStyle.Thin;
            }
            // Advance the Excel column cursor by this cell's own GridSpan only.
            // Horizontally continued cells are visited by this loop too and advance the cursor themselves,
            // so adding the accumulated hSpan here would count their columns twice.
            gridCol += (int)wCell.GridSpan;
        }
    }
}
static WTableCell GetCellAtGridColumn(WTableRow row, int gridColumn)
{
    int cursor = 1; // 1-based grid column within the table
    foreach (WTableCell c in row.Cells)
    {
        int span = (int)c.GridSpan;
        int start = cursor;
        int end = cursor + span - 1;
        // Return the cell whose grid range covers the requested column
        if (gridColumn >= start && gridColumn <= end)
            return c;
        cursor += span;
    }
    return null;
}
static string BuildCellText(WTableCell cell)
{
    // Join the non-empty paragraphs of the cell, one per line
    StringBuilder sb = new StringBuilder();
    for (int p = 0; p < cell.Paragraphs.Count; p++)
    {
        WParagraph para = cell.Paragraphs[p];
        string text = para.Text?.TrimEnd();
        if (!string.IsNullOrEmpty(text))
        {
            if (sb.Length > 0) sb.AppendLine();
            sb.Append(text);
        }
    }
    return sb.ToString();
}
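To make the grid-column lookup and the vertical-merge counting in the helper methods easier to follow, here is a small sketch that replays the same arithmetic on plain arrays instead of DocIO types. It assumes each cell is described only by its GridSpan and by whether it continues a vertical merge. The names CellIndexAtGridColumn and VerticalSpan and the sample table layout are hypothetical and exist only for this illustration.

```csharp
using System;

// GridSpan of every cell, row by row (left to right).
int[][] spans =
{
    new[] { 1, 1, 1 }, // row 0: first cell starts a vertical merge
    new[] { 1, 2 },    // row 1: second cell covers grid columns 2-3
    new[] { 1, 1, 1 },
    new[] { 1, 2 },
};
// True where a cell continues a vertical merge from the row above.
bool[][] continuesVerticalMerge =
{
    new[] { false, false, false },
    new[] { true, false },
    new[] { true, false, false },
    new[] { false, false },
};

Console.WriteLine(CellIndexAtGridColumn(spans[1], 3));                  // 1
Console.WriteLine(CellIndexAtGridColumn(spans[1], 4));                  // -1
Console.WriteLine(VerticalSpan(spans, continuesVerticalMerge, 0, 1));   // 3
Console.WriteLine(VerticalSpan(spans, continuesVerticalMerge, 0, 2));   // 1

// Returns the index of the cell covering the 1-based grid column, or -1 if the row is too short.
static int CellIndexAtGridColumn(int[] rowSpans, int gridColumn)
{
    int cursor = 1; // first grid column covered by the current cell
    for (int i = 0; i < rowSpans.Length; i++)
    {
        int end = cursor + rowSpans[i] - 1;
        if (gridColumn >= cursor && gridColumn <= end)
            return i;
        cursor += rowSpans[i];
    }
    return -1;
}

// Counts the rows covered by a vertical merge that starts at startRow in the given grid column.
static int VerticalSpan(int[][] spans, bool[][] continues, int startRow, int gridColumn)
{
    int vSpan = 1;
    for (int r = startRow + 1; r < spans.Length; r++)
    {
        int index = CellIndexAtGridColumn(spans[r], gridColumn);
        if (index >= 0 && continues[r][index])
            vSpan++;
        else
            break; // the merge ends at the first row that does not continue it
    }
    return vSpan;
}
```

Looking up cells by grid column rather than by cell index matters because rows can contain different numbers of cells: in row 1 above, grid column 3 belongs to the second cell, while in row 0 it belongs to the third.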
You can download a complete working sample that converts a Word file to an Excel file by extracting table data from a Word document and adding it to a worksheet in .NET from GitHub.
A sample output of converting a Word file to an Excel file by extracting table data from a Word document and adding it to a worksheet is shown below:
Take a moment to peruse the documentation, where you can find basic Word document processing options along with features such as mail merge; merging, splitting, and comparing Word documents; finding and replacing text in a Word document; protecting Word documents; and, most importantly, PDF and image conversions with code examples.
Conclusion
I hope you enjoyed learning how to convert a Word file to an Excel file by extracting table data from a Word document and adding it to a worksheet in .NET.
You can refer to our ASP.NET Core DocIO feature tour page to learn about its other features and documentation, and how to quickly get started with configuration specifications. You can also explore our ASP.NET Core DocIO example to understand how to create and manipulate data.
Current customers can check out our components from the License and Downloads page. If you are new to Syncfusion®, you can try our 30-day free trial to check out our other controls.
If you have any queries or require clarifications, please let us know in the comments section below. You can also contact us through our support forums, Direct-Trac, or feedback portal. We are always happy to assist you!