I tried to split at the pattern, but got an error: "regular expression is too big!" Then I split at ([\x00-\x7f][^\x00-\x7f]*) with "after this many: 1000". This time the garbage characters disappeared (but a few files still show ??? when converted back from UTF-8 to BIG5).
After splitting, the files have about 50000 characters each, which is not what I want. When I checked the files, their sizes seem irregular: some have only 1 line, some are blank, and some are around 1000 characters. Only "after this many: 30" produces files of around 1000 characters, and even then not exactly: some are blank, some are larger or smaller than others, some have only 100 characters, and some have 50 to 60. With "after this many: 1000", the files contain over 50000 characters. Why is that?
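
One possible explanation, sketched in Python rather than in the splitting tool itself: if "after this many" counts matches of the pattern and not characters, then each match is one ASCII character plus however many non-ASCII characters follow it, so the file size depends on how the text mixes the two. This is only an assumption about how the tool counts; the function names below are hypothetical and written just for this illustration.

```python
import re

# Pattern from the post: one ASCII character followed by any run of
# non-ASCII characters.
PATTERN = re.compile(r"[\x00-\x7f][^\x00-\x7f]*")


def split_after_n_matches(text, n):
    """Cut after every n-th match of PATTERN (assumed tool behaviour)."""
    pieces, start, count = [], 0, 0
    for match in PATTERN.finditer(text):
        count += 1
        if count == n:
            pieces.append(text[start:match.end()])
            start, count = match.end(), 0
    if start < len(text):
        # Whatever is left after the last full group of n matches.
        pieces.append(text[start:])
    return pieces


def split_every_n_chars(text, n):
    """Cut after every n characters, regardless of what they are."""
    return [text[i:i + n] for i in range(0, len(text), n)]


# Four lines of 50 Chinese characters: only the newlines are ASCII.
sample = ("中" * 50 + "\n") * 4
print([len(p) for p in split_after_n_matches(sample, 2)])
print([len(p) for p in split_every_n_chars(sample, 100)])
```

In this sketch, mostly-Chinese text gives very large pieces per match count, while mostly-ASCII text would give pieces close to the count itself, which would match the irregular sizes described above. Splitting by character count avoids that, at the cost of cutting in the middle of a line.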
Thank you very much for your reply.
