Integrating Human Knowledge Through Action Masking in Reinforcement Learning for Operations Research
2025
Abstract
Reinforcement learning (RL) provides a powerful method to address problems in
operations research. However, its real-world application often fails due to a
lack of user acceptance and trust. A possible remedy is to provide managers
with the possibility of altering the RL policy by incorporating human expert
knowledge. In this study, we analyze the benefits and caveats of including
human knowledge via action masking. While action masking has so far been used
to exclude invalid actions, its ability to integrate human expertise remains
underexplored. Human knowledge is often encapsulated in heuristics, which
suggest reasonable, near-optimal actions in certain situations. Enforcing such
actions should hence increase trust among the human workforce to rely on the
model's decisions. Yet, a strict enforcement of heuristic actions may also
restrict the policy from exploring superior actions, thereby leading to overall
lower performance. We analyze the effects of action masking based on three
problems with different characteristics, namely, paint shop scheduling, peak
load management, and inventory management. Our findings demonstrate that
incorporating human knowledge through action masking can achieve substantial
improvements over policies trained without action masking. In addition, we find
that action masking is crucial for learning effective policies in constrained
action spaces, where certain actions can only be performed a limited number of
times. Finally, we highlight the potential for suboptimal outcomes when action
masks are overly restrictive.
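
As a rough illustration of the mechanism the abstract describes, the sketch below applies an action mask to a policy's action scores so that masked actions receive zero probability. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `masked_policy` and `heuristic_mask`, the reorder-point rule, and the example scores are all hypothetical and chosen only to mirror the inventory management setting.

```python
import math


def masked_policy(logits, mask):
    """Softmax over action scores; masked-out actions get probability 0."""
    if not any(mask):
        raise ValueError("mask must allow at least one action")
    # Subtract the largest allowed score for numerical stability.
    peak = max(score for score, ok in zip(logits, mask) if ok)
    weights = [math.exp(score - peak) if ok else 0.0
               for score, ok in zip(logits, mask)]
    total = sum(weights)
    return [w / total for w in weights]


def heuristic_mask(stock, reorder_point, n_actions):
    """Hypothetical expert rule: action i means 'order i units'.

    When stock falls below the reorder point, 'order nothing'
    (action 0) is forbidden; otherwise every action is allowed.
    """
    mask = [True] * n_actions
    if stock < reorder_point:
        mask[0] = False
    return mask


# Hypothetical scores: the unmasked policy prefers action 0.
logits = [2.0, 0.5, 0.1]

low_stock = masked_policy(logits, heuristic_mask(3, 5, 3))
high_stock = masked_policy(logits, heuristic_mask(10, 5, 3))

# An overly restrictive mask leaves the policy no choice at all.
forced = masked_policy(logits, [False, False, True])
```

With low stock, the heuristic removes action 0 and the remaining probability mass is renormalized over the allowed actions. The `forced` case shows the caveat noted above: a mask that permits a single action fixes the decision regardless of what the policy has learned.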
| Reference Key | neumann2025integrating |
|---|---|
| Authors | Mirko Stappert; Bernhard Lutz; Niklas Goby; Dirk Neumann |
| Journal | arXiv |
| Year | 2025 |