How to extract text from a PowerPoint presentation?
In a PowerPoint presentation, text is always associated with shapes. Text can be added to, modified in, and extracted from auto-shapes such as text boxes, rectangles, ovals, and partial circles. The following code sample extracts the text from every shape and table cell in a presentation and writes it to a text file.
//Namespaces required by this sample
using System.Collections.Generic;
using Syncfusion.Presentation;

//Load the PowerPoint presentation
IPresentation presentation = Presentation.Open("Sample.pptx");
//Collection to store the extracted text
List<string> textCollection = new List<string>();
//Iterate through each slide in the presentation
foreach (ISlide slide in presentation.Slides)
{
    //Iterate through all the shapes in the slide to get the text
    foreach (IShape shape in slide.Shapes)
    {
        //Check whether the shape is a table
        if (shape is ITable)
        {
            ITable table = shape as ITable;
            //Iterate through all the cells in the table and get the text
            foreach (IRow row in table.Rows)
            {
                foreach (ICell cell in row.Cells)
                {
                    //Get the text from the cell body
                    string text = cell.TextBody.Text;
                    //Add the extracted text to the string collection
                    textCollection.Add(text);
                }
            }
        }
        else
        {
            //Iterate through all the paragraphs in the shape and get the text
            foreach (IParagraph paragraph in shape.TextBody.Paragraphs)
            {
                foreach (ITextPart textpart in paragraph.TextParts)
                {
                    //Get the text from the text part
                    string text = textpart.Text;
                    //Add the extracted text to the string collection
                    textCollection.Add(text);
                }
            }
        }
    }
}
//Write the text collection to a text file, one entry per line
System.IO.File.WriteAllLines("Sample.txt", textCollection);
//Close the presentation to release its resources
presentation.Close();
You can download the sample here.
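The sample above adds one entry per text part, so a paragraph that mixes several formatting runs is split across multiple lines of the output file. If one line per paragraph is preferred, a variation along the following lines may help. This is only a sketch: the class name ParagraphTextExtractor and the method Extract are hypothetical, it assumes the same Syncfusion.Presentation library used above, and the null check on TextBody is a defensive assumption rather than documented behavior. Tables are skipped here and can be handled exactly as in the sample above.

```csharp
using System.Collections.Generic;
using Syncfusion.Presentation;

//Hypothetical helper, not part of the original sample
static class ParagraphTextExtractor
{
    //Returns one string per paragraph found in the non-table shapes
    public static List<string> Extract(string path)
    {
        List<string> lines = new List<string>();
        //The using block disposes the presentation when extraction is done
        using (IPresentation presentation = Presentation.Open(path))
        {
            foreach (ISlide slide in presentation.Slides)
            {
                foreach (IShape shape in slide.Shapes)
                {
                    //Skip tables (handled separately) and, defensively,
                    //any shape that exposes no text body
                    if (shape is ITable || shape.TextBody == null)
                        continue;
                    foreach (IParagraph paragraph in shape.TextBody.Paragraphs)
                    {
                        //Paragraph-level text keeps all text parts together
                        lines.Add(paragraph.Text);
                    }
                }
            }
        }
        return lines;
    }
}
```

The result can then be written out in the same way as before, for example System.IO.File.WriteAllLines("Sample.txt", ParagraphTextExtractor.Extract("Sample.pptx"));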