PDF is increasingly popular in real-world business. A PDF page can hold text, images, tables, lists, and nearly any other kind of content. Because recipients cannot easily modify it, PDF is often used for invoices and other important documents.
However, there is often a need to get content out of a PDF, or to convert a PDF document to other formats, such as images, Office Word, or Office Excel documents. iDiTect provides a professional PDF editing and processing library and SDK to help C# and VB.NET developers solve this problem. This C# component supports converting PDF to images and extracting text from PDF.
To convert PDF documents to image files using C# .NET, the iDiTect.Converter C# tool can be used in any WinForms, console, or ASP.NET web project. You can convert a PDF page to a raster image format such as jpg/jpeg, png, tif/tiff, bmp, or gif. The DPI of the converted image can be customized as needed; a higher DPI produces a larger, higher-quality image. If you want to reduce the size of the output jpg image, you can compress it using the API built into the library.
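As a rough illustration of how the DPI setting affects output size, the sketch below computes the pixel dimensions of a rendered page. It does not call iDiTect.Converter; the class and method names (PageRaster, ToPixels) are hypothetical, and it assumes the default PDF unit of 72 points per inch.

```csharp
using System;

public static class PageRaster
{
    // Assumption: default PDF user space, 72 units (points) per inch.
    private const double PointsPerInch = 72.0;

    // Converts a length in PDF points to pixels at the given DPI.
    public static int ToPixels(double points, int dpi)
    {
        if (dpi <= 0) throw new ArgumentOutOfRangeException(nameof(dpi));
        return (int)Math.Round(points / PointsPerInch * dpi);
    }

    public static void Main()
    {
        // US Letter page: 612 x 792 points (8.5 x 11 inches).
        foreach (int dpi in new[] { 72, 150, 300 })
        {
            Console.WriteLine($"{dpi} dpi -> {ToPixels(612, dpi)} x {ToPixels(792, dpi)} px");
        }
    }
}
```

Doubling the DPI doubles both dimensions, so the pixel count (and roughly the uncompressed image size) grows fourfold. That is why a lower DPI or jpg compression may be preferable when file size matters more than sharpness.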
In addition, C# developers can convert an entire PDF document to a single multi-page TIFF image. The converted multi-page TIFF keeps the page order of the original PDF file and contains the same number of pages.
For extracting text from PDF using C# and VB.NET, the iDiTect.Converter .NET toolkit helps developers pull text out of a PDF while keeping the layout of the PDF page in the output. Text in the PDF header and footer is recognized, and text in tables, lists, paragraphs, and other sections can also be found and extracted. After converting the PDF to text, you can work with the output: find a target word, replace or cut text, or save it.
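Once the text is available as a plain string, the find, replace, and save steps described above need only standard .NET calls. The following is a minimal sketch under that assumption: the sample string stands in for text already extracted from a PDF, the ExtractedText class and FindAll helper are hypothetical names, and no iDiTect API is shown.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ExtractedText
{
    // Returns the zero-based index of every case-insensitive,
    // non-overlapping occurrence of word in text.
    public static List<int> FindAll(string text, string word)
    {
        var hits = new List<int>();
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(word)) return hits;

        int index = 0;
        while ((index = text.IndexOf(word, index, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            hits.Add(index);
            index += word.Length; // continue searching after this match
        }
        return hits;
    }

    public static void Main()
    {
        // Placeholder for text previously extracted from a PDF page.
        string text = "Invoice No. 1001\nTotal due: 250.00\nInvoice date: 2020-01-15";

        // Find a target word.
        List<int> hits = FindAll(text, "invoice");
        Console.WriteLine($"Found {hits.Count} occurrence(s).");

        // Replace it (string.Replace is case-sensitive) and save the result.
        string replaced = text.Replace("Invoice", "Receipt");
        File.WriteAllText("output.txt", replaced);
    }
}
```

The search here is case-insensitive while the replacement is case-sensitive; adjust the StringComparison value or use a regular expression if both need to behave the same way.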
In addition, you can search for text in a PDF directly. iDiTect provides a well-designed API for searching for words in a PDF using C#.