There are a couple of simple things to try, whether you use read.table or scan.
- Set nrows to the number of records in your data (nmax in scan).
- Make sure that comment.char="" to turn off interpretation of comments.
- Explicitly define the classes of each column using colClasses in read.table.
- Setting multi.line=FALSE may also improve performance in scan.
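As a rough sketch of the settings listed above, assuming a small comma-separated file with three columns (the file, the column names and the row count are all made up for illustration):

```r
# Hypothetical sample file so the sketch is self-contained.
n <- 1000
path <- tempfile(fileext = ".csv")
sample_data <- data.frame(
  id    = seq_len(n),
  value = rnorm(n),
  label = sample(letters, n, replace = TRUE)
)
write.csv(sample_data, path, row.names = FALSE, quote = FALSE)

# read.table with the row count, comment handling and column classes
# all stated up front, so R does not have to guess them.
fast_df <- read.table(
  path,
  header       = TRUE,
  sep          = ",",
  nrows        = n,
  comment.char = "",
  colClasses   = c("integer", "numeric", "character")
)

# The scan equivalent: nmax plays the role of nrows, and
# multi.line = FALSE says every record sits on a single line.
fast_list <- scan(
  path,
  what       = list(id = integer(), value = numeric(), label = character()),
  sep        = ",",
  skip       = 1,      # skip the header line
  nmax       = n,
  multi.line = FALSE,
  quiet      = TRUE
)
```

How much each setting helps depends on the file, so it is worth timing the call with system.time before and after.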
If none of these things work, then use one of the profiling packages to determine which lines are slowing things down. Perhaps you can write a cut-down version of
read.table based on the results.
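One way to do the profiling, sketched here with the Rprof sampler that ships with base R rather than an add-on package (that choice, the sample file and its size are assumptions for illustration):

```r
# Hypothetical sample file, large enough for the sampler to catch something.
n <- 300000
path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = seq_len(n), value = rnorm(n)),
          path, row.names = FALSE, quote = FALSE)

# Turn the sampling profiler on, do the read, then turn it off again.
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)
df <- read.table(path, header = TRUE, sep = ",")
Rprof(NULL)

# Summarise where the time went; by.self ranks functions by their own time.
prof <- summaryRprof(prof_file)
head(prof$by.self)
```

This reports time per function rather than per source line, and it needs an R build with profiling support. If the read finishes too quickly to collect any samples, summaryRprof will complain, so try it on a file that takes at least a second or so.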
The other alternative is filtering your data before you read it into R.
Or, if the problem is that you have to read it in regularly, then use these methods to read the data in once, then save the data frame as a binary blob with
save. Next time you can retrieve it faster with load.
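A minimal sketch of that read-once-then-cache approach, assuming a small two-column file (the paths and the object name big_df are made up for illustration):

```r
# Hypothetical sample file standing in for the real data.
n <- 1000
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = seq_len(n), value = rnorm(n)),
          csv_path, row.names = FALSE, quote = FALSE)

# Slow path, done once: parse the text file.
big_df <- read.table(
  csv_path,
  header       = TRUE,
  sep          = ",",
  nrows        = n,
  comment.char = "",
  colClasses   = c("integer", "numeric")
)

# Store the parsed data frame in R's binary format.
cache_path <- tempfile(fileext = ".RData")
save(big_df, file = cache_path)

# Fast path, every later session: load restores the object
# under the name it was saved with (big_df here).
rm(big_df)
load(cache_path)
```

Note that load puts the object back under its original name, so it will overwrite anything in the workspace that is already called big_df.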