TcpReassembly flowKey hash5Tuple fnvHash collision fix #2043
base: dev
@yvyw thanks for working on this PR! |
Should fix the error: "Error occurred - packet doesn't match either side of the connection!!"
Hi, @seladb! Thank you for PcapPlusPlus, it's very useful! The changes are intended to replace flowKey with the real connection object in order to eliminate collisions. Backward compatibility is not preserved because the
About breaking this into smaller PRs: I think I can, but it will take time. Also, the PRs would look strange without knowing the final purpose. Maybe it would be better to leave it as it is, or to split it into commits without creating separate PRs. What do you think?
Regarding CI: the project passes the tests on my machine. I will fix the errors as they appear in CI. |
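For readers following the thread, here is a minimal sketch of the general idea of keying a map on the connection tuple itself rather than on a 32-bit hash of it. It is not the code from this PR: `ConnKey`, `makeConnKey`, `ConnKeyHash`, `AlwaysCollideHash` and `ConnTable` are hypothetical names, and the key is IPv4-only for brevity. The point it illustrates is that `std::unordered_map` uses the hash only to pick a bucket and then falls back to `operator==`, so two connections whose hashes collide still get separate entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <tuple>
#include <unordered_map>

// Hypothetical key type for illustration; not a PcapPlusPlus class.
struct ConnKey
{
    uint32_t ipA, ipB;
    uint16_t portA, portB;

    bool operator==(const ConnKey& other) const
    {
        return ipA == other.ipA && ipB == other.ipB &&
               portA == other.portA && portB == other.portB;
    }
};

// Order the two endpoints so that both directions of a connection
// produce the same key.
inline ConnKey makeConnKey(uint32_t srcIp, uint16_t srcPort,
                           uint32_t dstIp, uint16_t dstPort)
{
    if (std::tie(srcIp, srcPort) <= std::tie(dstIp, dstPort))
        return ConnKey{srcIp, dstIp, srcPort, dstPort};
    return ConnKey{dstIp, srcIp, dstPort, srcPort};
}

// FNV-1a over the key bytes (assumed hash choice, for illustration only).
struct ConnKeyHash
{
    std::size_t operator()(const ConnKey& key) const
    {
        uint32_t hash = 2166136261u;
        auto mix = [&hash](uint32_t value, int bytes) {
            for (int i = 0; i < bytes; ++i)
            {
                hash ^= (value >> (8 * i)) & 0xFFu;
                hash *= 16777619u;
            }
        };
        mix(key.ipA, 4);
        mix(key.ipB, 4);
        mix(key.portA, 2);
        mix(key.portB, 2);
        return hash;
    }
};

// Deliberately bad hash, used only to force every key to collide.
struct AlwaysCollideHash
{
    std::size_t operator()(const ConnKey&) const { return 1; }
};

// Connection table keyed by the full tuple; the mapped value is a
// placeholder for whatever per-connection state is needed.
template <typename Hash>
using ConnTable = std::unordered_map<ConnKey, std::string, Hash>;
```

With `ConnTable<AlwaysCollideHash>`, inserting two different connections still yields two entries, because the equality check on the key separates them. With `ConnTable<ConnKeyHash>` the behaviour is the same, only with fewer bucket collisions.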
|
@yvyw replacing
Since it's used in TCP Reassembly, it should support high throughput and scale, and I think this implementation might degrade both. Please let me know what you think. |
|
@seladb, your concerns are valid. Here's what I think: On calculations slowdown. On memory consumption. |
I have thought about doing this, unfortunately |
I agree. I think this is premature optimization according to Knuth, and it's better to just wait for C++17. In the comment, I wanted to say that there is some more room for memory optimization if it's required. I think projects with extremely high loads will encounter problems anyway and end up tailoring their code with a profiler. |
|
I experimented with CI errors, but I can't really understand what gcov wants. It's probably constructors, destructors, or operator= for the classes, or something else is implicitly declared in random places, which gcov does not like. |
Regarding |
|
@yvyw the
Doesn't it mean calculating
Regarding the memory consumption: I looked more closely at the code, and I think I understand why the memory consumption is roughly the same 👍 |
Yes, otherwise we cannot be sure that we have found the same connection. It's a necessary evil when dealing with hash collisions. However, fitting all the data within the same CPU cache line can improve speed, but at the expense of memory.
I think it depends on the implementation. We can calculate the hash once, when the object is constructed, and then return the stored value whenever it is needed. |
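A rough sketch of what "calculate the hash when the object is constructed" could look like. `CachedConnKey` and `CachedConnKeyHash` are hypothetical names, not types from this PR, and the hash-combining formula is an arbitrary choice for illustration. The key stores its hash next to the tuple fields, so lookups never rehash, and equality checks the cached hash first before comparing the fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>

// Hypothetical key that computes its hash once, at construction time.
class CachedConnKey
{
public:
    CachedConnKey(uint32_t ipA, uint32_t ipB, uint16_t portA, uint16_t portB)
        : m_ipA(ipA), m_ipB(ipB), m_portA(portA), m_portB(portB),
          m_hash(computeHash(ipA, ipB, portA, portB))
    {}

    // Returns the value computed in the constructor; no rehashing.
    std::size_t hash() const { return m_hash; }

    // Cheap rejection on the cached hash, then the full comparison that
    // is still needed to tell colliding connections apart.
    bool operator==(const CachedConnKey& other) const
    {
        return m_hash == other.m_hash &&
               m_ipA == other.m_ipA && m_ipB == other.m_ipB &&
               m_portA == other.m_portA && m_portB == other.m_portB;
    }

private:
    // Assumed combining scheme, chosen only for the example.
    static std::size_t computeHash(uint32_t ipA, uint32_t ipB,
                                   uint16_t portA, uint16_t portB)
    {
        uint64_t ips = (static_cast<uint64_t>(ipA) << 32) | ipB;
        uint32_t ports = (static_cast<uint32_t>(portA) << 16) | portB;
        std::size_t seed = std::hash<uint64_t>{}(ips);
        seed ^= std::hash<uint32_t>{}(ports) + 0x9e3779b9u + (seed << 6) + (seed >> 2);
        return seed;
    }

    uint32_t m_ipA, m_ipB;
    uint16_t m_portA, m_portB;
    std::size_t m_hash;  // costs sizeof(std::size_t) extra bytes per key
};

// Hasher for unordered containers: just hands back the cached value.
struct CachedConnKeyHash
{
    std::size_t operator()(const CachedConnKey& key) const { return key.hash(); }
};
```

The trade-off is the one mentioned above: each key grows by one `std::size_t`, in exchange for not recomputing the hash on every lookup.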
@seladb @yvyw Not really familiar with the tcp reassembly code, but just throwing my 2 cents here. I see that Could we use
Since flow key duplication is an edge case, I think this would result in amortized O(1) on step 3, since most ranges will have 1 element. IMO, it should also require fewer overall changes to the original code, but that is for you to decide, as I haven't really dived deep. |
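The container name is missing from the comment above. Assuming it refers to something like `std::unordered_multimap` queried with `equal_range`, the lookup could be sketched as below. `ConnData`, `FlowTable`, `sameConnection` and `findConnection` are hypothetical names, and the 32-bit `flowKey` is taken as already computed by the caller. Several connections may share one flow key, and the range returned for that key is scanned for the entry whose tuple actually matches the packet.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical per-connection record; `id` stands in for real state.
struct ConnData
{
    uint32_t srcIp, dstIp;
    uint16_t srcPort, dstPort;
    int id;
};

// Several connections may be stored under the same 32-bit flow key.
using FlowTable = std::unordered_multimap<uint32_t, ConnData>;

// True if the packet endpoints match the connection in either direction.
inline bool sameConnection(const ConnData& conn,
                           uint32_t srcIp, uint16_t srcPort,
                           uint32_t dstIp, uint16_t dstPort)
{
    bool forward = conn.srcIp == srcIp && conn.srcPort == srcPort &&
                   conn.dstIp == dstIp && conn.dstPort == dstPort;
    bool reverse = conn.srcIp == dstIp && conn.srcPort == dstPort &&
                   conn.dstIp == srcIp && conn.dstPort == srcPort;
    return forward || reverse;
}

// Scan every entry stored under flowKey; the range normally holds a
// single element, so the loop usually runs once.
inline ConnData* findConnection(FlowTable& table, uint32_t flowKey,
                                uint32_t srcIp, uint16_t srcPort,
                                uint32_t dstIp, uint16_t dstPort)
{
    auto range = table.equal_range(flowKey);
    for (auto it = range.first; it != range.second; ++it)
    {
        if (sameConnection(it->second, srcIp, srcPort, dstIp, dstPort))
            return &it->second;
    }
    return nullptr;  // no connection with this tuple under this key
}
```

A caller would insert a new `ConnData` under the flow key when `findConnection` returns `nullptr`, so a colliding connection gets its own entry instead of being mistaken for the existing one.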
Fix #636. Should fix the error "Error occurred - packet doesn't match either side of the connection!!", if it is caused by hash collision.