Throw Warning When Reading File with Blank Lines#7707
Throw Warning When Reading File with Blank Lines#7707Asa-Henry wants to merge 7 commits intoRdatatable:masterfrom
Conversation
…t it doesn't solve the problem yet.
…kip' is greater than 0 when blank lines are present, so I also check if blank lines should be skipped so I can throw a warning to let the user know.
|
PR could have meaningful title |
|
please add a test case |
|
This is a good start. Can you run You can see that the tests don't pass. I think that tests 1578.1, 1578.2, 1578.6, 1883 will need to be updated for the new warning, possibly 1867.07 and 1867.11 too. Unfortunately, other tests show a regression:
Any ideas how the spurious warnings can be fixed? If you need to debug |
|
Thanks for this comprehensive response @aitap! This made it easier to understand what went wrong with my changes! In response to your question: it appears I introduced confusion between a file with intermittent blank lines and a file which has the header and the data separated by a blank line. For issues such as 1840.2, looks like I over looked how spaces come into play when parsing... I picked up on that |
… accomodate situation where the header and data are separated by a blank line.
…y a blank line. In the case of each line separated by a blank line, 'prevStart' is always NULL because each line could be a possible header.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #7707 +/- ##
==========================================
- Coverage 99.04% 99.03% -0.01%
==========================================
Files 87 87
Lines 17037 17039 +2
==========================================
+ Hits 16874 16875 +1
- Misses 163 164 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
What is the problem with the 'atime performance tests'? It's saying that |
That CI doesn't work on PRs from forks -- you can ignore it |
|
May I get some help writing a test? I'm trying to make a test using the input shown in the initial issue #3339. The following is what I have: input = "x y\n\n1 a\n\n2 b\n\n3 c"
test(
1578.10,
fread(input),
data.table(V1=3, V2=c),
warning="The rows in this file appear to be separated by blank lines. This resulted in most rows being skipped. If this was not the intended outcome, please consider setting 'blank.lines.skip' to TRUE."
)But then test 1578.1 fails. If I don't add my test, then all tests pass. |
|
Congratulations on figuring this out! (Sorry I didn't find the time to help.) Your test should work after you replace |
…nted out. Causes an error in test 1578.1?
|
Oh, right. Test numbers are ordinary doubles, so |
Adds a check to detect when blank lines are present in the file passed to
fread, but the user did not specify to skip blank lines. This change throws a warning to let the user know they may want to consider setting theblank.lines.skiptoTRUE.