loading

How to Convert HTML to MS Word Document using PHP

How to Convert HTML to MS Word Document using PHP

How to Convert HTML to MS Word Document using PHP

0 Sales

Free

HTML to MS Word Document conversion is mostly used in online applications to create a.doc/.docx file with dynamic HTML content. Microsoft Word is the most often used file format for exporting dynamic material for offline consumption. JavaScript may be used to simply provide the feature of exporting HTML material to MS Word. However, server-side intervention is required to convert dynamic material to Doc.

The server-side export to word capability is extremely handy for converting dynamic HTML material to a Microsoft Word document and downloading it as a.docx file. Using PHP, the MS Word document may be readily made using HTML information. In this article, we'll teach you how to use PHP to convert HTML to MS Word documents.

HTML To Word (DOC/DOCX) Library

HTML To Word (DOC/DOCX) Converter

The HTML TO DOC class is a PHP-based custom library that allows you to build MS Word documents and incorporate HTML-formatted information in them.

 - setDocFileName() – Changes the name of the document file.
 - setTitle() – Changes the title of the document.
 - getHeader() – Generate the document's header section.
 - getFotter() – Generate the document's footer section.
 - _parseHtml() – Parse and filter HTML from the source.
 - createDoc() – Create a word document in.dcox format.
 - write file() — Insert text into a word document.

<?php 
/** 
 * Convert HTML to MS Word document 
 * @name HTML_TO_DOC 
 * @version 2.0 
 * @author CodexWorld 
 * @link https://www.codexworld.com 
 */ 
class HTML_TO_DOC 
{ 
    var $docFile  = ''; 
    var $title    = ''; 
    var $htmlHead = ''; 
    var $htmlBody = ''; 
     
    /** 
     * Constructor 
     * 
     * @return void 
     */ 
    function __construct(){ 
        $this->title = ''; 
        $this->htmlHead = ''; 
        $this->htmlBody = ''; 
    } 
     
    /** 
     * Set the document file name 
     * 
     * @param String $docfile  
     */ 
    function setDocFileName($docfile){ 
        $this->docFile = $docfile; 
        if(!preg_match("/\.doc$/i",$this->docFile) && !preg_match("/\.docx$/i",$this->docFile)){ 
            $this->docFile .= '.doc'; 
        } 
        return;  
    } 
     
    /** 
     * Set the document title 
     * 
     * @param String $title  
     */ 
    function setTitle($title){ 
        $this->title = $title; 
    } 
     
    /** 
     * Return header of MS Doc 
     * 
     * @return String 
     */ 
    function getHeader(){ 
        $return = <<<EOH 
        <html xmlns:v="urn:schemas-microsoft-com:vml" 
        xmlns:o="urn:schemas-microsoft-com:office:office" 
        xmlns:w="urn:schemas-microsoft-com:office:word" 
        xmlns="http://www.w3.org/TR/REC-html40"> 
         
        <head> 
        <meta http-equiv=Content-Type content="text/html; charset=utf-8"> 
        <meta name=ProgId content=Word.Document> 
        <meta name=Generator content="Microsoft Word 9"> 
        <meta name=Originator content="Microsoft Word 9"> 
        <!--[if !mso]> 
        <style> 
        v\:* {behavior:url(#default#VML);} 
        o\:* {behavior:url(#default#VML);} 
        w\:* {behavior:url(#default#VML);} 
        .shape {behavior:url(#default#VML);} 
        </style> 
        <![endif]--> 
        <title>$this->title</title> 
        <!--[if gte mso 9]><xml> 
         <w:WordDocument> 
          <w:View>Print</w:View> 
          <w:DoNotHyphenateCaps/> 
          <w:PunctuationKerning/> 
          <w:DrawingGridHorizontalSpacing>9.35 pt</w:DrawingGridHorizontalSpacing> 
          <w:DrawingGridVerticalSpacing>9.35 pt</w:DrawingGridVerticalSpacing> 
         </w:WordDocument> 
        </xml><![endif]--> 
        <style> 
        <!-- 
         /* Font Definitions */ 
        @font-face 
            {font-family:Verdana; 
            panose-1:2 11 6 4 3 5 4 4 2 4; 
            mso-font-charset:0; 
            mso-generic-font-family:swiss; 
            mso-font-pitch:variable; 
            mso-font-signature:536871559 0 0 0 415 0;} 
         /* Style Definitions */ 
        p.MsoNormal, li.MsoNormal, div.MsoNormal 
            {mso-style-parent:""; 
            margin:0in; 
            margin-bottom:.0001pt; 
            mso-pagination:widow-orphan; 
            font-size:7.5pt; 
                mso-bidi-font-size:8.0pt; 
            font-family:"Verdana"; 
            mso-fareast-font-family:"Verdana";} 
        p.small 
            {mso-style-parent:""; 
            margin:0in; 
            margin-bottom:.0001pt; 
            mso-pagination:widow-orphan; 
            font-size:1.0pt; 
                mso-bidi-font-size:1.0pt; 
            font-family:"Verdana"; 
            mso-fareast-font-family:"Verdana";} 
        @page Section1 
            {size:8.5in 11.0in; 
            margin:1.0in 1.25in 1.0in 1.25in; 
            mso-header-margin:.5in; 
            mso-footer-margin:.5in; 
            mso-paper-source:0;} 
        div.Section1 
            {page:Section1;} 
        --> 
        </style> 
        <!--[if gte mso 9]><xml> 
         <o:shapedefaults v:ext="edit" spidmax="1032"> 
          <o:colormenu v:ext="edit" strokecolor="none"/> 
         </o:shapedefaults></xml><![endif]--><!--[if gte mso 9]><xml> 
         <o:shapelayout v:ext="edit"> 
          <o:idmap v:ext="edit" data="1"/> 
         </o:shapelayout></xml><![endif]--> 
         $this->htmlHead 
        </head> 
        <body> 
EOH; 
        return $return; 
    } 
     
    /** 
     * Return Document footer 
     * 
     * @return String 
     */ 
    function getFotter(){ 
        return "</body></html>"; 
    } 
 
    /** 
     * Create The MS Word Document from given HTML 
     * 
     * @param String $html :: HTML Content or HTML File Name like path/to/html/file.html 
     * @param String $file :: Document File Name 
     * @param Boolean $download :: Wheather to download the file or save the file 
     * @return boolean  
     */ 
    function createDoc($html, $file, $download = false){ 
        if(is_file($html)){ 
            $html = @file_get_contents($html); 
        } 
         
        $this->_parseHtml($html); 
        $this->setDocFileName($file); 
        $doc = $this->getHeader(); 
        $doc .= $this->htmlBody; 
        $doc .= $this->getFotter(); 
                         
        if($download){ 
            @header("Cache-Control: ");// leave blank to avoid IE errors 
            @header("Pragma: ");// leave blank to avoid IE errors 
            @header("Content-type: application/octet-stream"); 
            @header("Content-Disposition: attachment; filename=\"$this->docFile\""); 
            echo $doc; 
            return true; 
        }else { 
            return $this->write_file($this->docFile, $doc); 
        } 
    } 
     
    /** 
     * Parse the html and remove <head></head> part if present into html 
     * 
     * @param String $html 
     * @return void 
     * @access Private 
     */ 
    function _parseHtml($html){ 
        $html = preg_replace("/<!DOCTYPE((.|\n)*?)>/ims", "", $html); 
        $html = preg_replace("/<script((.|\n)*?)>((.|\n)*?)<\/script>/ims", "", $html); 
        preg_match("/<head>((.|\n)*?)<\/head>/ims", $html, $matches); 
        $head = !empty($matches[1])?$matches[1]:''; 
        preg_match("/<title>((.|\n)*?)<\/title>/ims", $head, $matches); 
        $this->title = !empty($matches[1])?$matches[1]:''; 
        $html = preg_replace("/<head>((.|\n)*?)<\/head>/ims", "", $html); 
        $head = preg_replace("/<title>((.|\n)*?)<\/title>/ims", "", $head); 
        $head = preg_replace("/<\/?head>/ims", "", $head); 
        $html = preg_replace("/<\/?body((.|\n)*?)>/ims", "", $html); 
        $this->htmlHead = $head; 
        $this->htmlBody = $html; 
        return; 
    } 
     
    /** 
     * Write the content in the file 
     * 
     * @param String $file :: File name to be save 
     * @param String $content :: Content to be write 
     * @param [Optional] String $mode :: Write Mode 
     * @return void 
     * @access boolean True on success else false 
     */ 
    function write_file($file, $content, $mode = "w"){ 
        $fp = @fopen($file, $mode); 
        if(!is_resource($fp)){ 
            return false; 
        } 
        fwrite($fp, $content); 
        fclose($fp); 
        return true; 
    } 
}

 

Convert HTML to Word Document

Using the HTML TO DOC class, the following sample code converts HTML text to a Microsoft Word document and saves it as a.docx file.

1. Initialize and load the HTML TO DOC class.

// Load library 
include_once 'HtmlToDoc.class.php'// Initialize class 
$htd = new HTML_TO_DOC();

2. Enter the HTML content to be converted.

$htmlContent ' 
    <h1>Hello World!</h1> 
    <p>This document is created from HTML.</p>';

3. In order to convert HTML to a Word document, use the createDoc() method.

 - Set the variable containing the HTML text ($htmlContent).
 - To save the word file, choose a name for it (my-document).

$htd->createDoc($htmlContent"my-document");

Download word file:

Set the third argument of createDoc() to TRUE to download the word file.

$htd->createDoc($htmlContent"my-document"1);

Create Word Document from HTML File

By giving the HTML file name, you may convert the HTML file content to a Word document.

$htd->createDoc("source.html""my-document");

Word Document File Format

If no extension is specified in the file name supplied to the createDoc() function, the file will be stored in.doc format by default. To save a word document into a.docx file format, add an extension to the file name.

$htd->createDoc($htmlContent"my-document.docx");

Conclusion

There are several third-party libraries available for converting HTML to Word. However, you may use PHP to convert HTML information to Word Documents without the use of an external library. Our HTML TO DOC class converts dynamic HTML information to Word documents and saves/downloads as a.docx file using PHP. You may simply extend the functionality of the HTML To Doc class to meet your own requirements.

LICENSE OF USE

You can use it for personal or commercial projects. You can't resell it partially or in this form.

PRODUCT INFO

Create Date : Jan 25, 2022

Updated Date : Jan 25, 2022

Ratings

Comments : 0

Downloads : 0