I am working on a crawler that opens files, parses them, and writes their content straight into a database.

However, I have had problems with files that contain odd characters, and I'm wondering whether there is a simple way to force the string into an ANSI encoding before I insert it into the database, to make sure it contains no illegal characters.

The project is written in C#, and the code I use to insert rows into the database is as follows:

cmd = new OleDbCommand("INSERT INTO TaIndex (IndexId, IndexTekst, IndexDato, IndexModulId, IndexModul, IndexFilsti) VALUES (?, ?, ?, ?, ?, ?);", conn);
cmd.Parameters.Add("IndexId", OleDbType.Integer).Value = newIdGetter();
cmd.Parameters.Add("IndexTekst", OleDbType.LongVarChar).Value = Text;
cmd.Parameters.Add("IndexDato", OleDbType.Date).Value = DateTime;
cmd.Parameters.Add("IndexModulId", OleDbType.VarChar).Value = ModuleId;
cmd.Parameters.Add("IndexModul", OleDbType.VarChar).Value = Module;
cmd.Parameters.Add("IndexFilsti", OleDbType.VarChar).Value = ((object)FilePath) ?? DBNull.Value;

The problem is with the IndexTekst field, whose value comes from the files.

Well, you can always check whether the string can be encoded and then decoded back to the same value:

public static bool CanBeRoundTripped(Encoding encoding, string text)
{
    // Encode to bytes, decode again, and compare with the original.
    byte[] bytes = encoding.GetBytes(text);
    string decoded = encoding.GetString(bytes);
    return text == decoded;
}

Call that on each text field before saving it, and then decide what to do if it fails.
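One option when the check fails is to store a lossy copy instead of rejecting the text. The following is only a sketch of that idea: PrepareForStorage is a hypothetical helper name, not something from the question, and replacing unrepresentable characters with "?" is an assumption about what you would want to happen.

```csharp
using System.Text;

public static class RoundTripSketch
{
    // Same check as in the answer, repeated so the sketch is self-contained.
    public static bool CanBeRoundTripped(Encoding encoding, string text)
    {
        byte[] bytes = encoding.GetBytes(text);
        return text == encoding.GetString(bytes);
    }

    // Hypothetical helper: returns the text unchanged when it survives the
    // round trip; otherwise returns a copy in which every character the
    // encoding cannot represent has been replaced by '?'.
    public static string PrepareForStorage(Encoding encoding, string text)
    {
        if (CanBeRoundTripped(encoding, text))
        {
            return text;
        }

        // Same code page, but with an explicit replacement fallback so the
        // result does not depend on the encoding's default fallback.
        Encoding lossy = Encoding.GetEncoding(
            encoding.CodePage,
            new EncoderReplacementFallback("?"),
            new DecoderReplacementFallback("?"));

        return lossy.GetString(lossy.GetBytes(text));
    }
}
```

With Encoding.GetEncoding("iso-8859-1"), a string such as "blåbær" comes back unchanged, while the euro sign in "5 €" is replaced, since ISO-8859-1 has no euro sign. Logging or skipping the file would be equally reasonable reactions; which one fits depends on how the index is used.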

Is there any way you can change the database schema to accept all Unicode characters? That would be a much nicer approach, IMO.

If you do want to use some kind of ANSI encoding, you need to work out exactly which encoding you mean. There are many encodings that get called "ANSI", so you have to figure out which code page you actually need.
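To see why the choice of code page matters, the same bytes can be decoded under different named encodings. This is a minimal sketch; DecodeWith is a hypothetical helper, and the encoding names used with it are only examples, not a recommendation for your data.

```csharp
using System.Text;

public static class CodePageSketch
{
    // Hypothetical helper: decodes the same bytes under a named encoding so
    // the results of different code pages can be compared side by side.
    public static string DecodeWith(string encodingName, byte[] bytes)
    {
        return Encoding.GetEncoding(encodingName).GetString(bytes);
    }
}
```

For the bytes 0x62 0x6C 0xE5, DecodeWith("iso-8859-1", ...) gives "blå", whereas DecodeWith("us-ascii", ...) gives "bl?" because 0xE5 is outside the ASCII range. Note that on .NET Core and later, Windows code pages such as 1252 are only available after calling Encoding.RegisterProvider(CodePagesEncodingProvider.Instance), while ISO-8859-1 and ASCII work without registration.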

You could try this, which round-trips the text through the system default encoding:

cmd.Parameters.Add("IndexTekst", OleDbType.LongVarChar).Value = Encoding.Default.GetString(Encoding.Default.GetBytes(Text));

Or convert explicitly between the two encodings with Encoding.Convert().
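A rough sketch of the Encoding.Convert() route, assuming you have the raw bytes of the file and know which encoding they were written in. Reencode is a hypothetical helper name, and both encodings are parameters because the question does not say which ones apply.

```csharp
using System.Text;

public static class ConvertSketch
{
    // Hypothetical helper: re-encodes raw file bytes from the encoding they
    // were written in to the target encoding, then decodes the result into
    // a string that only contains characters the target can represent.
    public static string Reencode(byte[] fileBytes, Encoding source, Encoding target)
    {
        byte[] converted = Encoding.Convert(source, target, fileBytes);
        return target.GetString(converted);
    }
}
```

For example, UTF-8 bytes for "blåbær" passed with Encoding.UTF8 as the source and Encoding.GetEncoding("iso-8859-1") as the target come back as the same string. Be careful with Encoding.Default as the target: on .NET Framework it is the system ANSI code page, but on .NET Core and later it is always UTF-8, so it would not restrict the text at all there.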