A week ago, while I was working with a WYSIWYG editor (CKEditor), a question came to mind. Is there any way to extract the content of a .doc or .docx file into the Blogger or WordPress text area? For example, we would not need to select and copy the text or images from the doc(x) file. All we would have to do is give the file to the WYSIWYG editor, and the content of the doc(x) file would be copied and pasted into the post.

Any suggestions would be appreciated. Thanks, Fawaz

Edit: Alternatively, see this plugin.

This WordPress plugin will process an uploaded .docx file, extracting all of the content as a post.


I think you could use PHPWord to extract the contents of .docx files.

(I should probably point out that .docx files are simply .zip files with a specific structure: Office Open XML.)

However, it appears to be more focused on writing .docx files than on reading them.

There is a class, PHPWord_Template, that contains this in its __construct:

$this->_objZip = new ZipArchive();
$this->_objZip->open($this->_tempFileName);

$this->_documentXML = $this->_objZip->getFromName('word/document.xml');

This returns an XML document like this:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<w:document xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
  <w:body>
    <w:p w:rsidR="005B1098" w:rsidRDefault="005B1098"/>
    <w:p w:rsidR="005B1098" w:rsidRDefault="005B1098">
      ...
      <w:r w:rsidRPr="00F15611">
        <w:rPr>
          <w:rFonts w:ascii="Calibri" w:hAnsi="Calibri" w:cs="Calibri"/>
          <w:lang w:val="en-GB"/>
        </w:rPr>
        <w:t xml:space="preserve">The following table contains a few values that can be edited by the PHPWord_Template class.</w:t>
      </w:r>
      ...
  </w:body>
</w:document>

This XML contains the text of the document, inside the w:t elements.
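
If plain text is all you need, the same two ZipArchive calls can be used directly, without PHPWord. The following is only a rough sketch under some assumptions: docxParagraphs is a hypothetical helper name (it is not part of PHPWord), the namespace URI is the xmlns:w value from the XML above, and all formatting, images, and tables are ignored.

```php
<?php
// Hypothetical helper, not part of PHPWord: returns the plain text of a
// .docx file as an array with one string per <w:p> paragraph.
function docxParagraphs(string $path): array
{
    // A .docx file is a zip archive; the main text is in word/document.xml.
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a zip archive");
    }
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException('word/document.xml not found in archive');
    }

    $doc = new DOMDocument();
    if (!$doc->loadXML($xml, LIBXML_NONET)) {
        throw new RuntimeException('word/document.xml is not valid XML');
    }

    // Bind the "w" prefix to the WordprocessingML namespace (xmlns:w above).
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace(
        'w',
        'http://schemas.openxmlformats.org/wordprocessingml/2006/main'
    );

    $paragraphs = [];
    foreach ($xpath->query('//w:p') as $p) {
        // A paragraph's text is split across runs; join every <w:t> in it.
        $text = '';
        foreach ($xpath->query('.//w:t', $p) as $t) {
            $text .= $t->textContent;
        }
        $paragraphs[] = $text; // empty paragraphs come back as ''
    }
    return $paragraphs;
}
```

Something like implode("\n", docxParagraphs($uploadedFile)) could then be placed in the editor's textarea, where $uploadedFile stands for wherever your upload ends up. Bold, fonts, lists, and images are all lost here, because that information lives in the w:rPr elements and in other parts of the archive.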

It looks like it would be a lot of work this way if you want to preserve all of the formatting. Much more work than copying and pasting into a text field.