I need to read a file (actually a .pdf file) into a String and write it back to a file. Between those two steps I'll apply some patches to the string, but that isn't important here. I have written the following JUnit test case:

    String f1String=FileUtils.readFileToString(f1);
    File temp=File.createTempFile("deleteme", "deleteme");
    FileUtils.writeStringToFile(temp, f1String);
    assertTrue(FileUtils.contentEquals(f1, temp));

This test reads a file into a String and writes it back. However, the test fails. It might be due to the encodings, but the FileUtils documentation doesn't say much about this. Can anybody help? Thanks!

Added for better understanding: why do I want this? I have large ebooks on one machine, which are replicated on another one. The first machine is in charge of creating those ebooks. Because of the poor connectivity of the second machine and the large size of the ebooks, I don't want to sync the entire files, only the changes made. To create and apply patches, I am using Google's DiffMatchPatch library. This library produces patches between two strings. So I need to load a PDF into a String, apply a generated patch, and write it back to a file.

A PDF isn't a text file. Decoding binary data that isn't encoded text into Java characters and then re-encoding it is asymmetrical. For instance, if the input byte stream is invalid for the current encoding, you can be sure it won't re-encode to the same bytes. In short: don't do that. Use readFileToByteArray and writeByteArrayToFile instead.
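To see the asymmetry without any file on disk, here is a minimal sketch using only the JDK. The class name `BinaryRoundTrip` and the sample bytes are made up for illustration, and UTF-8 is assumed as the encoding; a real PDF under a different default encoding may break in a different place.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original test.
class BinaryRoundTrip {

    // Decode the bytes as UTF-8 text, then encode the text back to bytes.
    static byte[] viaString(byte[] input) {
        String decoded = new String(input, StandardCharsets.UTF_8);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    // Sample data: "%PDF" followed by four bytes that are not valid UTF-8.
    static byte[] sample() {
        return new byte[] {0x25, 0x50, 0x44, 0x46,
                (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
    }

    public static void main(String[] args) {
        byte[] original = sample();
        byte[] roundTripped = viaString(original);
        // The invalid sequences were replaced while decoding, so the bytes differ.
        System.out.println("same after String round trip: "
                + Arrays.equals(original, roundTripped));
        // A plain byte copy is always identical.
        System.out.println("same after byte copy: "
                + Arrays.equals(original, original.clone()));
    }
}
```

The decoder substitutes a replacement character for each malformed sequence, and that information cannot be recovered when encoding again.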

Only a couple of ideas:

  1. There might actually be some BOM (byte order mark) bytes in one of the files that either get stripped when reading or added when writing. Is there a difference in file size (if it's the BOM, the difference should be 2 or 3 bytes)?

  2. The line breaks might not match, depending on which system the files were created on, i.e. one might have CR LF while the other has only LF or CR (1 byte difference per line break).

  3. According to the JavaDoc, both methods should use the default encoding of the JVM, which should be the same for both operations. However, try testing with an explicitly set encoding (the JVM's default encoding can be queried using System.getProperty("file.encoding")).
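The checks suggested in the list above can be sketched as a small diagnostic helper. `RoundTripDiagnostics` and its method names are hypothetical, introduced only for this example; it works on byte arrays, which you would obtain from the two files yourself (for instance with readFileToByteArray).

```java
import java.nio.charset.Charset;

// Hypothetical helper, named here only for illustration.
class RoundTripDiagnostics {

    // Index of the first differing byte, or -1 if the arrays are identical.
    // If one array is a prefix of the other, returns the shorter length.
    static int firstMismatch(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        return a.length == b.length ? -1 : common;
    }

    // True if the data starts with the UTF-8 byte order mark EF BB BF.
    static boolean startsWithUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    // One-line summary: size difference, first mismatch, JVM default charset.
    static String describe(byte[] original, byte[] rewritten) {
        return "size difference: " + (rewritten.length - original.length)
                + ", first mismatch at: " + firstMismatch(original, rewritten)
                + ", default charset: " + Charset.defaultCharset();
    }
}
```

A size difference of 2 or 3 bytes with a mismatch at index 0 would point towards a BOM, while a difference that grows with the number of lines would point towards line breaks.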

Ed Staub's answer explains why my solution wasn't working, and he recommended using bytes instead of Strings. In my case I need a String, so the final working solution I found is the following:

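    import java.io.File;
    import java.io.IOException;
    import org.apache.commons.io.FileUtils;
    // JUnit's @Test and assertTrue imports depend on the JUnit version in use.

    // f1 is the source File, defined elsewhere in the test class.
    @Test
    public void testFileRWAsArray() throws IOException {
        // Read the raw bytes and map each byte to exactly one char via a cast.
        byte[] bytes = FileUtils.readFileToByteArray(f1);
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append((char) b);
        }
        String f1String = sb.toString();

        // Map each char back to one byte; the narrowing cast undoes the cast above.
        byte[] newBytes = new byte[f1String.length()];
        for (int i = 0; i < f1String.length(); ++i) {
            newBytes[i] = (byte) f1String.charAt(i);
        }

        File temp = File.createTempFile("deleteme", "deleteme");
        FileUtils.writeByteArrayToFile(temp, newBytes);
        assertTrue(FileUtils.contentEquals(f1, temp));
    }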

By casting between byte and char, I get a symmetric conversion. Thanks all!
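As a possible alternative to the manual casts (not what the test above does), the JDK's ISO-8859-1 charset also maps every byte value to exactly one char and back. A minimal sketch, with the hypothetical class name `Latin1RoundTrip`:

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, named here only for illustration.
class Latin1RoundTrip {

    // Each byte 0x00..0xFF becomes the char U+0000..U+00FF.
    static String toText(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Each char U+0000..U+00FF becomes the byte 0x00..0xFF again.
    static byte[] toBytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

Two caveats under this assumption: the resulting String differs from the cast version for bytes 0x80 and above (the cast sign-extends them to chars 0xFF80 and above), so both machines would need to use the same conversion; and any char above U+00FF that ends up in the patched string would not survive the conversion back to bytes.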