Skip to content

Conversation

@mini-1235
Copy link
Collaborator


Basic Info

Info Please fill out this column
Ticket(s) this addresses #4361
Primary OS tested on Ubuntu
Robotic platform tested on tb3
Does this PR contain AI generated software? No

Description of contribution in a few bullet points

Fix dead code as discussed in the issue

Description of documentation updates required from your changes

None

Description of how this change was tested

Tested on sim, cost of 253 works


Future work that may be required in bullet points

For Maintainers:

  • Check that any new parameters added are updated in docs.nav2.org
  • Check that any significant change is added to the migration guide
  • Check that any new features OR changes to existing behaviors are reflected in the tuning guide
  • Check that any new functions have Doxygen added
  • Check that any new features have test coverage
  • Check that any new plugins is added to the plugins page
  • If BT Node, Additionally: add to BT's XML index of nodes for groot, BT package's readme table, and BT library lists

@mergify
Copy link
Contributor

mergify bot commented Mar 19, 2025

This pull request is in conflict. Could you fix it @mini-1235?

@codecov
Copy link

codecov bot commented Mar 19, 2025

Codecov Report

Attention: Patch coverage is 92.85714% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
nav2_mppi_controller/src/critics/cost_critic.cpp 90.00% 1 Missing ⚠️
Files with missing lines Coverage Δ
...clude/nav2_mppi_controller/critics/cost_critic.hpp 81.25% <ø> (-5.42%) ⬇️
...2_mppi_controller/src/critics/obstacles_critic.cpp 81.17% <100.00%> (+0.92%) ⬆️
nav2_mppi_controller/src/critics/cost_critic.cpp 85.91% <90.00%> (-5.27%) ⬇️

... and 6 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Member

@SteveMacenski SteveMacenski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this is equivalent or as performant as before, unfortunately! I think a good chunk of this may need to be reverted

@mini-1235
Copy link
Collaborator Author

Moving #4996 (comment)

@SteveMacenski Thanks for looking into this PR!

but I'd like to have some validation that when the code we're talking about is triggered, its only in that 255 case (that we're not all mistaking something that really happens from a static analysis).

Before submitting the PR, and again after your comment, I ran several experiments to better understand when this code segment is actually triggered. Let me share a little bit for my findings and my ideas behind this PR:)

Interestingly the code coverage metrics show coverage in the supposedly dead code segment. I suspect this happens when the pose_cost is UNKNOWN because its off the costmap which is greater than 253.

I agree that this is not a 100% dead code, and maybe it is more suitable to call it as a logic error here, sorry for the confusion.
Based on my observations, the code segment would be triggered in three cases:

  1. As you mentioned: "when the pose_cost is UNKNOWN because its off the costmap", then the pose cost is 255, >= 253
  2. When we set consider footprint as true in CostCritic, but using robot radius in the costmap instead of footprint(default nav2 params), then
    switch (static_cast<unsigned char>(score_cost)) {
    case (nav2_costmap_2d::LETHAL_OBSTACLE):
    return true;
    case (nav2_costmap_2d::INSCRIBED_INFLATED_OBSTACLE):
    return consider_footprint_ ? false : true;
    case (nav2_costmap_2d::NO_INFORMATION):
    return is_tracking_unknown_ ? false : true;
    }

    253 would not be considered as in collision, therefore triggering the code segment
  3. Point cost returns a 253, and footprint cost also returns a 253, this is very rare, and I didn't notice this when submitting the PR(sorry for that)

Before proceeding further with this PR, there are two points I would really appreciate your insight on:

  1. We dont want to use the footprint cost for the cost to apply, only the collision checking process itself

Can you elaborate more on this? From the documentation side, I see that critical cost is described as "Cost to apply to a pose with any point in inflated space to prefer distance from obstacles.", That description made it seem reasonable to me that we should consider footprint cost during scoring. Also, using the footprint cost of 253 helps penalizes the near collision trajectory points, as the comment suggested.

  1. Does point cost of 253 always result in footprint cost of 254?
    My setup:
footprint: "[ [0.21, 0.195], [0.21, -0.195], [-0.21, -0.195], [-0.21, 0.195] ]"

I ran this in a tb3 sim environment, and from the log, I saw when pointcost returns a 253, the footprint cost(aka max linecost) returns:

[component_container_isolated-8] line cost 158
[component_container_isolated-8] line cost 220
[component_container_isolated-8] line cost 253
[component_container_isolated-8] line cost 251

I am a bit confused by this, and this is the third condition I mentioned above(I have always thought that if point cost returns 253 then footprint cost is certainly 254, resulting in colision)

If those questions are just my misunderstandings, I agree to close this PR:)

@SteveMacenski
Copy link
Member

When we set consider footprint as true in CostCritic, but using robot radius in the costmap instead of footprint(default nav2 params), 253 would not be considered as in collision, therefore triggering the code segment

Oh interesting. I think we should probably log a warning in this case, this should be loudly mentioned to make sure folks know that is the case that the robot information in the local costmap and settings in consider_footprint are not aligned (in both cases if (costmap_ros_->getUseRadius() == consider_footprint) {// Problem!!}. Probably for the Obstacle critic too.

We dont want to use the footprint cost for the cost to apply, only the collision checking process itself

Basically if we include the footprint cost, anywhere that the robot base has any point in the inflated 253 space, its always scored as 253 (max valid when using footprints). So if just one point on the robot is in that space, its the same as the robot driving up against a wall. Instead, if we score based on the center point-costs, then there's a penalization of proximity of the robot against the wall all the way until its basically touching. Also, that penalty is then directly scaling by the inflation layer's exponential function users tune - which could raise dramatically near obstacles or more gradually over space (so its tunable within that close-to-obstacles regime).

This was a big reason that the Cost Critic was added and replaces the Obstacles Critic -- that used the distance which succomes to that exact problem always since the distance calculated from cost will always have the distance as the same when the footprint is touching 253 space.

Maybe the docs need updating / clarification. Does that make sense?

Does point cost of 253 always result in footprint cost of 254?

Generally, yes, except for discretization situations where you're just barely touching the corner of a costmap cell that is 253 for the center cost and the ray-marching on the footprint's resolution is coarse enough to miss the bit that might in 254. Generally speaking, yes, if its "in" the inflated 253 space, the footprint should be 254. The exception to that is if the center point is just barely skirting the edge of the 253 inflated space, it can sometimes also return 253.

if you run that again and have the robot squarely in the 253 inflated space, you should always see 254.

@mini-1235
Copy link
Collaborator Author

The exception to that is if the center point is just barely skirting the edge of the 253 inflated space, it can sometimes also return 253.

Should we describe this kind of situation as near collision or in collision?

@SteveMacenski
Copy link
Member

SteveMacenski commented Mar 27, 2025

Basically the 2 edge situations to think about are:

  • The robot's pose is chipping the corner of the inflation radius past actual collision by some fractional amount of the cell resolution -- so it says there's a collision, but technically its not in collision by some very small amount. A center-check would result in a collision but if you actually checked the polygon it may not be (but barely)
  • When you're near an obstacle and do a polygon check, it could conceivable for an irregular shaped obstacle, or an obstacle misaligned with the costmap's major X and Y axes, there are part of the polygon that collide with just a tiny amount of an occupied cell and says its not in collision (but still 253). This is because we use an iterator to march over the line segment at a resolution and that resolution might miss that tiny amount. We could reduce the resolution to be infinitesimally small, but that comes then at an infinite computational cost. So a trade off has to be made for reason, which I believe it checking every 0.5 costmap cell resolution (so 2x checks per cell, on average).

Then the obvious situations where the robot is in the collision by center cost and actually is; and robot's footprint collides with obstacle and we detect it.

The edge cases are created by the fact that we're representing continuous space by a binning quantization (occupancy grid square cells), which both (a) obstacles can be at an arbitrary angle with respect to and (b) the footprint of the robot can be at any angle with respect to. There's some aliasing as an artifact of that quantization resolution choice. In actuality for the first case, the center point cost is the one that is wrong, not the polygonal check, since the far corner of the square cell is outside of the actual inflation radius (the radius is computed by cell center distances). In the second case, its up in the air about whether or not that could be considered slightly not a collision or slightly in collision, but either way they would be scored as highly costed or invalid, respectively. Its hard to say whether (a) the obstacle is misrepresented by the cell's aliasing or bininng of sensor data or (b) the iterator marching resolution barely misses detecting the colliding cell.

Funny enough, questions like this and things like this are where its really obvious how much the graphics community and robotics overlap! I've taken inspiration from the graphics community a few times over the years in making robotics systems :-)

Sorry, I know that was alot. But tl;dr is I think we should treat it as a near collision, since its likely just an artifact of square-cell aliasing angles and/or binning quantization

@mini-1235
Copy link
Collaborator Author

Sorry, I know that was alot. But tl;dr is I think we should treat it as a near collision, since its likely just an artifact of square-cell aliasing angles and/or binning quantization

No worries at all, I actually love hearing this kind of breakdown. Super insightful, and that really clarifies my questions. I think there are two things to be done based on #4361 and your explanations in this PR.

  1. Calculate the actual distance (this is not discussed in this PR and we might need to implement other costmap layers)

  2. Regarding this code segment

    if (pose_cost >= 253.0f /*INSCRIBED_INFLATED_OBSTACLE in float*/) {
    traj_cost += critical_cost_;
    } else if (!near_goal) { // Generally prefer trajectories further from obstacles

    This is actually not dead code, it will trigger on 1. 253(point)+253(footprint)+consider footprint 2. 255 + is tracking unknown case, so we can actually keep this, unless you prefer to make it as a tunable option as you suggested earlier

have that check be changed to be useful by parameterizing the cost, above which the critical_cost is applied (which defaults to something like 254 to be disabled) as a tunable option for users to set a max proximity before which it should avoid even if non-collision inducing

However, we could/should do these things to cost critic:

  • Warn users when settings are not aligned

I think we should probably log a warning in this case, this should be loudly mentioned to make sure folks know that is the case that the robot information in the local costmap and settings in consider_footprint are not aligned (in both cases if (costmap_ros_->getUseRadius() == consider_footprint) {// Problem!!}. Probably for the Obstacle critic too.

  • Move the inCollision outside the else block for better readability

The inCollision should be moved out of the else statement so that both cases of worldToMapFloat result in checking on the pose cost.

What are your thoughts on these? I think I can close this PR and open a new one for the second point?

@SteveMacenski
Copy link
Member

SteveMacenski commented Mar 27, 2025

This is actually not dead code, it will trigger on 1. 253(point)+253(footprint)+consider footprint 2. 255 + is tracking unknown case, so we can actually keep this, unless you prefer to make it as a tunable option as you suggested earlier

I think keep it and make it tunable with 253 as the default would be good! I don't think that would cause any issues with your assessment of the (3) cases for 253 being triggered to possibly lower it. I would add in a warning log if its set > 253

Warn users when settings are not aligned

Agreed! 👍

Move the inCollision outside the else block for better readability

Agreed! 👍

A new PR probably makes sense! Though, you could just revert this work and force-push on this branch, if you like. I'm up for either :-)

@mergify
Copy link
Contributor

mergify bot commented Mar 28, 2025

@mini-1235, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Signed-off-by: mini-1235 <[email protected]>
@mergify
Copy link
Contributor

mergify bot commented Mar 28, 2025

@mini-1235, your PR has failed to build. Please check CI outputs and resolve issues.
You may need to rebase or pull in main due to API changes (or your contribution genuinely fails).

Copy link
Member

@SteveMacenski SteveMacenski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Just sanity check performance for me? If you add some print logs here when the critic is the cost critic, does the performance meaningfully change?

@mini-1235 mini-1235 changed the title Fix cost critic dead code Add near collision cost and warnings for misaligned parameter settings in MPPI critics Mar 29, 2025
@mini-1235
Copy link
Collaborator Author

If you add some print logs here when the critic is the cost critic, does the performance meaningfully change?

I have tested with default params, no performance change

Link to docs: ros-navigation/docs.nav2.org#665

Copy link
Member

@SteveMacenski SteveMacenski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor change and LGTM!

@mini-1235
Copy link
Collaborator Author

Should we backport this PR (except for the near collision cost change) to jazzy/humble?

@SteveMacenski SteveMacenski merged commit e90717d into ros-navigation:main Apr 1, 2025
14 checks passed
@SteveMacenski
Copy link
Member

@mini-1235 we can backport fully to Jazzy since the default behavior of the parameter is the same. Humble I agree we should only backport the checks (I'm actually not sure if getUseRadius is in Humble or not though?).

If you open those, I can merge those easily :-)

Thanks for the back and forth and great analysis!

@mini-1235
Copy link
Collaborator Author

Hi @SteveMacenski, I have backported it for both Jazzy and Humble, and I have tested it locally.

I am not entirely sure why they are failing, I guess the CI is for Rolling only(?)

@mini-1235 mini-1235 deleted the fix-cost-critic branch April 1, 2025 21:41
@SteveMacenski
Copy link
Member

Yeah its not currently setup for non-rolling builds at the moment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants