
Implementation

The GUI was built over seven iterations across 18 weeks (December 2025 to April 2026). Each iteration had a defined objective, produced working software, and closed with a review before the next iteration began. This section documents the development timeline, the system architecture, and the key engineering decisions that shaped the final system.

V-Model alignment: This page corresponds to the base of the V, where decomposed GUI concepts are integrated into a single, testable subsystem build.

The implementation story is still a product story. Each release turned the GUI from a set of selected concepts into something a boxer could actually use in training, with the architecture serving the user flow rather than the other way around.

Figure: Seven-Iteration Build Timeline. Each iteration adds one layer of training flow, integration, or polish.

1. Shell: PySide6, nav, Jetson
2. Curriculum: combos, session UI
3. Sparring: Markov, assessment
4. Performance: tests, sensors
5. Navigation: stack, AI coaching
6. User mgmt: login, validation
7. Dashboard: ROS 2, phone, design

Development Iterations

The iteration plan followed the product development sequence from Chapter 3: select the concept, build the minimum viable version, check it against user needs, then refine the training flow before moving to the next layer.

A key product insight emerged from user testing at the Robotics Meets AI Showcase in late January 2026 (mid-Iteration 3). Interviews with boxers of varying skill levels revealed distinct patterns in how beginner, intermediate, and advanced practitioners approached the equipment. These observations directly informed the design of the proficiency assessment, a six-question checklist administered during signup that classifies users into three tiers based on their responses. This user-centred approach meant the assessment reflected real training behaviour rather than arbitrary difficulty levels.

System Architecture

The application follows a five-layer architecture. Each layer has a single responsibility, so changes in one layer do not cascade into unrelated parts of the codebase.

From the user's perspective, that architecture exists to keep the training experience smooth. A user should see a stable drill flow and fast feedback, even when the underlying data, integration, or session logic changes.

Presentation Layer

PySide6 widgets: all pages, buttons, labels, input fields, event callbacks, visual feedback

Application Layer

Navigation stack and page history, user session state, configuration management

Business Logic Layer

Combo Curriculum Engine (mastery algorithm), Performance Testing Logic, User Management, Proficiency Assessment

Integration Layer

GuiBridge (ROS 2 QThread bridge), CV interface (ROS topics + file-based fallback), Robot arm control (ROS command topics), Phone dashboard (shared SQLite + JSON command file)

Data Layer

SQLite databases (per-user files), configuration file management

The Integration Layer is the critical enabler for parallel development. Rather than calling hardware interfaces directly, all hardware communication is routed through this layer. During development on a Windows laptop with no hardware connected, the layer switches automatically to mock interfaces that simulate hardware responses. When deployed on the Jetson Orin NX, the mock interfaces are replaced by real implementations without any changes to the layers above.
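
The mock-or-real switch described above can be sketched as follows. This is a minimal illustration rather than the project's code: the names `ArmInterface`, `MockArm`, `RosArm`, and `make_arm_interface` are hypothetical, and the `hardware_available` flag stands in for whatever detection the real layer performs.

```python
from abc import ABC, abstractmethod


class ArmInterface(ABC):
    """What the upper layers see; they never talk to hardware directly."""

    @abstractmethod
    def send_command(self, command: str) -> str:
        ...


class MockArm(ArmInterface):
    """Simulates a hardware response for laptop-only development."""

    def send_command(self, command: str) -> str:
        return f"mock-ack:{command}"


class RosArm(ArmInterface):
    """Placeholder for the real implementation (would publish to a ROS topic)."""

    def send_command(self, command: str) -> str:
        raise NotImplementedError("requires ROS 2 and connected hardware")


def make_arm_interface(hardware_available: bool) -> ArmInterface:
    # The detection rule is an assumption; the text only says the switch is automatic.
    return RosArm() if hardware_available else MockArm()


arm = make_arm_interface(hardware_available=False)
print(arm.send_command("jab"))
```

Because the layers above depend only on `ArmInterface`, swapping the mock for the real class needs no changes outside the factory.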

That separation mattered because the product had to keep moving while the rest of the robot was still being developed. The GUI could be tested with realistic training flows before the physical system was fully available.

ROS 2 Integration

The GUI communicates with the broader BoxBunny system through a GuiBridge interface that receives session events and sends user commands. This keeps interface behavior responsive while preserving a clean boundary between user interaction flow and backend processing. From the user's perspective, this enables live updates for punch confirmations, drill progress, session state, and coaching prompts.

In product terms, the bridge is what makes the GUI feel alive during training. The user sees the session react immediately, rather than feeling like they are waiting on a separate machine in the background.

During laptop-only development, the same interface runs in mock mode so the GUI can be validated without connected hardware. Internal node architecture, transport details, and backend implementation are documented in Section 5.3.

IMU Navigation

A key usability constraint identified during needs finding is that boxers cannot reliably operate a touchscreen while actively training. Research on touch input shows that even moderate hand coverage significantly reduces accuracy on targets below 48px (Parhi, Karlson and Myers, 2006). The GUI is therefore designed for normal touchscreen use with bare hands, while the IMU pads provide the control path when the touchscreen is impractical, such as during rest periods between rounds.

This feature exists because the training product has to work under movement and fast session changes. The navigation model was chosen to keep the boxer focused on the drill instead of on the screen.

Figure: IMU navigation pad layout, showing the four pads (Up / Head, Left, Right, Confirm) and the navigation response each one triggers.

The four pads double as navigation controls outside of training sessions. The left pad moves to the previous item, the right pad moves to the next, the centre pad confirms a selection, and the head pad navigates back. This mapping follows the directional convention familiar from physical four-button controllers, which reduces the need for users to learn new interaction patterns (Norman, 2013).

One edge case is the home screen, which has no parent to return to. On the home screen, the head pad opens a quick-access preset overlay instead of attempting a back navigation with nowhere to go.

During active training, all pad navigation is disabled. Without this, punch impacts on the pads would trigger accidental page changes mid-session. The disable state activates during the countdown and active phases of every session and restores automatically during rest and after session completion.
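
The pad mapping, the home-screen edge case, and the training-time lock can be combined into one small dispatch function. The sketch below is illustrative only: the function name `resolve_pad_action`, the action strings, and the phase names are assumptions, not identifiers from the codebase.

```python
from typing import Optional

# Session phases in which pad navigation is disabled (from the text above).
LOCKED_PHASES = {"countdown", "active"}

# Pad-to-action mapping; the string names are illustrative.
PAD_ACTIONS = {
    "left": "previous_item",
    "right": "next_item",
    "centre": "confirm",
    "head": "back",
}


def resolve_pad_action(pad: str, phase: str, on_home_screen: bool) -> Optional[str]:
    """Return the navigation action for a pad hit, or None if it is ignored."""
    if phase in LOCKED_PHASES:
        return None  # punch impacts must not change pages mid-session
    if pad == "head" and on_home_screen:
        return "open_preset_overlay"  # home has no parent page to go back to
    return PAD_ACTIONS.get(pad)


print(resolve_pad_action("head", "rest", on_home_screen=False))
print(resolve_pad_action("left", "active", on_home_screen=False))
```

Checking the lock first means no later rule can accidentally re-enable navigation during a round.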

Session Orchestration

The GUI is the session lifecycle controller for every training mode: technique drills, sparring, free training, and performance tests. Every session begins with an explicit start request from the GUI and ends with an explicit end request that retrieves the session summary.

From the product point of view, this is the core training loop. The user configures a drill, starts it, receives live state feedback, and ends with a summary that supports the next training decision.

Figure: Typical Session Flow. A training session moves from setup to live control to results review.

1. Configure: mode, level, duration
2. Start: session ID issued
3. Run: countdown, active, rest
4. Finish: summary returned
5. Review: results, next-session choice

The sequence on a typical session is as follows. The user configures the session and taps Start. The GUI sends a request to the backend with the training mode, difficulty, and username, and receives a session ID in return. Throughout the session, state changes arrive from the backend: countdown, active round, rest period, and next round. The GUI uses these to drive the on-screen timer, round counter, and screen transitions. When the final round ends or the user exits early, the GUI sends an end request with the session ID. The backend returns a summary containing punch counts, accuracy scores, and performance metrics, which the results page then displays.

This separation means Section 5.3 (Robot Intelligence) handles scoring and data collection logic, while Section 5.1 (GUI) handles all presentation and user flow logic. The GUI does not need to know how punch scoring is calculated, and the backend does not need to know how results are displayed.
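
The start, run, and end sequence can be sketched with a controller that only forwards requests and reacts to state changes. Everything named here is hypothetical: `FakeBackend`, `SessionController`, the method names, and the summary fields are placeholders for illustration, and the real transport is described in Section 5.3.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class FakeBackend:
    """Stand-in for the backend; returns canned placeholder values."""

    def start_session(self, mode: str, difficulty: str, username: str) -> str:
        return "session-001"

    def end_session(self, session_id: str) -> Dict[str, float]:
        return {"punch_count": 42, "accuracy": 0.8}  # placeholder numbers


@dataclass
class SessionController:
    backend: FakeBackend
    session_id: str = ""
    phase: str = "idle"
    phases_seen: List[str] = field(default_factory=list)

    def start(self, mode: str, difficulty: str, username: str) -> None:
        # Explicit start request; the backend answers with a session ID.
        self.session_id = self.backend.start_session(mode, difficulty, username)

    def on_state_change(self, phase: str) -> None:
        # Backend-pushed phases drive the timer, round counter, and screens.
        self.phase = phase
        self.phases_seen.append(phase)

    def end(self) -> Dict[str, float]:
        # Explicit end request; the summary feeds the results page.
        summary = self.backend.end_session(self.session_id)
        self.phase = "idle"
        return summary


controller = SessionController(FakeBackend())
controller.start("sparring", "Intermediate", "alex")
for phase in ("countdown", "active", "rest", "active"):
    controller.on_state_change(phase)
summary = controller.end()
print(summary)
```

The controller never computes a score itself, which mirrors the split between presentation logic and scoring logic.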

Phone Dashboard

A companion phone dashboard gives users a quick way to check progress and make light training adjustments without walking back to the robot. It is most useful between rounds or after a session, when the boxer wants a fast summary rather than a full touchscreen interaction. More detailed integration notes are in Section 5.3.

Gamification System

The rank and badge system gives users visible progress after a session, which helps keep training feeling rewarding instead of repetitive. The detailed scoring and data model are covered in Section 5.3.

Key Engineering Decisions

Navigation Stack

The application contains over 40 pages managed by a central QStackedWidget. Rather than coding back button destinations manually on each page, a navigation stack records the current page index before every transition. The diagram below illustrates how the stack operates as a user navigates through a typical training flow.

Figure: Navigation Stack: Push on Forward, Pop on Back

Forward navigation (push): Login → Main Menu → Training → Combo Select → Session
Stack after arriving at Session: Bottom ← [ Login, Main Menu, Training, Combo Select ] → Top
Back navigation (pop): Session → Combo Select (pop: Combo Select) → Training (pop: Training) → Main Menu (pop: Main Menu)

When the stack is empty, pressing Back falls back safely to the Main Menu. This eliminates manual per-page back button wiring across all 40+ pages.
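
The push-on-forward, pop-on-back behaviour can be sketched without any Qt dependency. In this minimal version the `show_page` callback stands in for a call such as `QStackedWidget.setCurrentIndex`; the class name `NavigationStack`, its method names, and the page constants are assumptions made for the example.

```python
from typing import Callable, List

LOGIN, MAIN_MENU, TRAINING, COMBO_SELECT, SESSION = range(5)


class NavigationStack:
    """Records the current page before each transition so Back needs no per-page wiring."""

    def __init__(self, show_page: Callable[[int], None], home_index: int, start_index: int) -> None:
        self._show_page = show_page
        self._home = home_index
        self.history: List[int] = []
        self.current = start_index
        show_page(start_index)

    def go_to(self, index: int) -> None:
        self.history.append(self.current)  # push the page being left
        self.current = index
        self._show_page(index)

    def back(self) -> None:
        # Pop the previous page, or fall back to the home page when empty.
        self.current = self.history.pop() if self.history else self._home
        self._show_page(self.current)


shown: List[int] = []
nav = NavigationStack(shown.append, home_index=MAIN_MENU, start_index=LOGIN)
for page in (MAIN_MENU, TRAINING, COMBO_SELECT, SESSION):
    nav.go_to(page)
print(nav.history)  # stack contents on arrival at the Session page
nav.back()
print(nav.current)
```

Each page only ever calls `go_to` or `back`, so adding a page does not require touching the Back logic of any other page.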

Proficiency Assessment

On signup, new users complete a six-question checklist. Each question has three answer options scored 0, 1, or 2. The total score (0 to 12) maps the user to a suggested proficiency level.

Question | Option 1 (0) | Option 2 (1) | Option 3 (2)
Have you trained boxing before? | Never | A few times | Regularly
Do you know the basic punches? | No | Somewhat | Yes
Can you throw a basic 1-2-3 combo? | No | With help | Yes
Have you done any sparring before? | Never | Once or twice | Yes, regularly
How would you describe your fitness level? | Low | Moderate | High
Have you used boxing equipment before? | Never | A few times | Regularly

Table 5.1.2-3: Proficiency Assessment Questions

Total Score | Suggested Level
0 to 4 | Beginner
5 to 8 | Intermediate
9 to 12 | Advanced

Table 5.1.2-4: Proficiency Assessment Scoring

The user can accept or override the suggestion on the result page before confirming. This classification determines the default combo difficulty tier shown on the training page.
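
The scoring rule in the two tables reduces to a sum and two thresholds. The following is a minimal sketch; the function name `suggest_level` and the input validation are assumptions, while the score bands come from the scoring table.

```python
from typing import Sequence


def suggest_level(answers: Sequence[int]) -> str:
    """Map six answers (each scored 0, 1, or 2) to a suggested level."""
    if len(answers) != 6 or any(a not in (0, 1, 2) for a in answers):
        raise ValueError("expected six answers scored 0, 1, or 2")
    total = sum(answers)  # ranges from 0 to 12
    if total <= 4:
        return "Beginner"
    if total <= 8:
        return "Intermediate"
    return "Advanced"


# Example: "A few times", "Somewhat", "With help", "Never", "Moderate", "A few times"
print(suggest_level([1, 1, 1, 0, 1, 1]))
```

The returned value is only a suggestion; the user's override on the result page would replace it before it is stored.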

Combo Curriculum and Mastery Algorithm

The curriculum contains 50 combinations: 15 Beginner, 20 Intermediate, and 15 Advanced. Each combo uses a notation system where numbers 1 through 6 represent punch types (Jab, Cross, Lead Hook, Rear Hook, Lead Uppercut, Rear Uppercut), with a "b" suffix for body shots and text labels for defensive movements (slip, block, roll). A combo is considered mastered when the user has completed at least five sessions with an average score of 3.0 out of 5.0 or above. This threshold-based progression model draws on mastery learning theory, which holds that learners should demonstrate competence at one level before advancing to the next (Bloom, 1984). Progress is tracked per-user in the SQLite database and persists across sessions. The self-select sequence builder interface is shown in Appendix 3.
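
The notation and the mastery rule can be illustrated together. This sketch assumes that moves are separated by hyphens (as in "1-2-3") and that each session stores a single score out of 5.0; the function names `parse_combo` and `is_mastered` are hypothetical.

```python
from statistics import mean
from typing import List, Sequence

PUNCHES = {
    "1": "Jab", "2": "Cross", "3": "Lead Hook",
    "4": "Rear Hook", "5": "Lead Uppercut", "6": "Rear Uppercut",
}
DEFENSIVE = {"slip", "block", "roll"}


def parse_combo(notation: str) -> List[str]:
    """Expand a combo string such as '1-2-3b-slip' into readable move names."""
    moves = []
    for token in notation.split("-"):
        if token in DEFENSIVE:
            moves.append(token.capitalize())
        elif token in PUNCHES:
            moves.append(PUNCHES[token])
        elif token.endswith("b") and token[:-1] in PUNCHES:
            moves.append(PUNCHES[token[:-1]] + " (body)")  # "b" suffix = body shot
        else:
            raise ValueError(f"unknown token: {token!r}")
    return moves


def is_mastered(scores: Sequence[float], min_sessions: int = 5, threshold: float = 3.0) -> bool:
    """Mastered: at least five sessions with an average score of 3.0 or above."""
    return len(scores) >= min_sessions and mean(scores) >= threshold


print(parse_combo("1-2-3b-slip"))
print(is_mastered([3.5, 2.5, 3.0, 4.0, 3.0]))
```

Both conditions must hold, so four perfect sessions do not count as mastery, and neither do five sessions averaging below 3.0.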

Sparring Mode: Markov Chain Generation

Sparring mode generates punch sequences using a first-order Markov chain. A first-order Markov chain is a probabilistic sequence model in which the probability of each next state depends only on the current state (Norris, 1997). This property makes it suitable for real-time punch sequence generation within the Jetson's compute budget, as no sequence history needs to be stored or evaluated. Each boxing style (Pressure Fighter, Counter Puncher, Infighter, Out-Boxer, Random) defines a transition probability matrix over punch types. Starting from a "start" state, the system picks the next punch based on weighted probabilities, continuing until an "end" state is reached or a safety limit of six punches per combo is hit. Sparring is available to all proficiency levels. The style selection interface is shown in Appendix 3.

Below is the transition matrix for the Counter Puncher style as an example. Each row shows the current state, and the columns show the probability of transitioning to each next punch (or ending the combo).

From State 1 (Jab) 2 (Cross) 3 (L.Hook) 4 (R.Hook) 5 (L.Upper) 6 (R.Upper) End
start 0.30 0.30 0.40
1 (Jab) 0.40 0.60
2 (Cross) 0.30 0.20 0.50
3 (L.Hook) 0.30 0.70
4 (R.Hook) 0.20 0.80
5 (L.Upper) 0.10 0.90
6 (R.Upper) 1.00

Table 5.1.2-6: Counter Puncher Markov Transition Matrix

Worked example (Counter Puncher):

1. start: Roll probabilities. Jab (0.30), Cross (0.30), End (0.40). Suppose the roll picks Jab.
2. After Jab: Cross (0.40), End (0.60). Suppose the roll picks Cross.
3. After Cross: Lead Hook (0.30), Rear Hook (0.20), End (0.50). Suppose the roll picks End.

Result: the generated combo is 1-2 (Jab, Cross). The Counter Puncher style has high "end" probabilities, so it naturally produces short, precise combinations. Contrast this with the Pressure Fighter, whose low "end" probabilities generate longer, more aggressive sequences.
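
The generation loop from the worked example can be sketched as a weighted random walk. The rows for start, Jab, and Cross below use the probabilities quoted in the worked example; the two hook rows are simplified placeholders that end the combo immediately, so this dictionary is an assumption and not the full Counter Puncher matrix. The function name `generate_combo` is also hypothetical.

```python
import random
from typing import Dict, List

Matrix = Dict[str, Dict[str, float]]

COUNTER_PUNCHER: Matrix = {
    "start": {"1": 0.30, "2": 0.30, "end": 0.40},
    "1": {"2": 0.40, "end": 0.60},
    "2": {"3": 0.30, "4": 0.20, "end": 0.50},
    "3": {"end": 1.00},  # placeholder row (assumption)
    "4": {"end": 1.00},  # placeholder row (assumption)
}
MAX_PUNCHES = 6  # safety limit per combo


def generate_combo(matrix: Matrix, rng=random, max_punches: int = MAX_PUNCHES) -> List[str]:
    """Walk the chain from 'start' until 'end' is drawn or the limit is reached."""
    combo: List[str] = []
    state = "start"
    while len(combo) < max_punches:
        options = matrix[state]  # only the current state matters (first order)
        state = rng.choices(list(options), weights=list(options.values()))[0]
        if state == "end":
            break
        combo.append(state)
    return combo


rng = random.Random(7)  # seeded for repeatable demos
print("-".join(generate_combo(COUNTER_PUNCHER, rng)))
```

Passing the random source as a parameter keeps the generator testable, since a scripted source can replay an exact sequence of rolls. With a start-to-end probability of 0.40 an empty combo is possible, so a caller may want to redraw in that case.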

When a user's weakness profile is available (from previous sparring sessions), the transition weights are blended with a bias multiplier. The blending factor alpha increases gradually with session count (capped at 0.4), so the adaptation strengthens over time without fully overriding the style's character.
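
The text gives the cap on alpha but not the exact blending formula, so the following is one plausible reading rather than the implemented method. It assumes alpha grows linearly with session count, that each next-state has a bias multiplier (1.0 meaning no bias), and that rows are renormalised after blending. `ALPHA_PER_SESSION`, `blending_factor`, and `blend_row` are all hypothetical.

```python
from typing import Dict

ALPHA_CAP = 0.4           # cap stated in the text
ALPHA_PER_SESSION = 0.05  # assumed growth rate, not from the source


def blending_factor(session_count: int) -> float:
    """Alpha rises with experience but never exceeds the cap."""
    return min(ALPHA_CAP, ALPHA_PER_SESSION * max(session_count, 0))


def blend_row(row: Dict[str, float], bias: Dict[str, float], alpha: float) -> Dict[str, float]:
    """Mix the style's weights with bias-scaled weights, then renormalise to sum to 1."""
    mixed = {
        nxt: (1 - alpha) * p + alpha * p * bias.get(nxt, 1.0)
        for nxt, p in row.items()
    }
    total = sum(mixed.values())
    return {nxt: p / total for nxt, p in mixed.items()}


after_jab = {"2": 0.40, "end": 0.60}
alpha = blending_factor(session_count=20)
print(alpha)
print(blend_row(after_jab, bias={"2": 2.0}, alpha=alpha))
```

Because alpha never exceeds 0.4, at least 60 percent of each weight still comes from the style's own matrix, which is how the style keeps its character.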

AI Coaching Integration

Users benefit from feedback that appears when it is most useful: quick prompts during training and a more reflective chat view after a session. That keeps the coaching feel light and readable while still giving the boxer a clear next step. More detail on the AI coaching pipeline is in Section 5.3.

Sound System

Audio feedback reinforces task events without requiring the user's visual attention, which is useful during active training when the user is focused on the pads rather than the screen (Gaver, 1989). The GUI uses 18 WAV sound effects, all preloaded at startup so that playback begins without a loading delay.

A priority tier system prevents lower-priority sounds from masking higher-priority ones. Round start and end bells carry the highest priority among session sounds. Countdown ticks sit below bells, hit confirmation sounds sit below ticks, and UI navigation clicks sit at the lowest priority. A navigation click cannot interrupt a bell, but a bell will always cut through regardless of what else is playing. Volume and per-sound toggles are available from the Settings page.
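
The tier rule can be sketched as a small arbiter that compares the requested sound with the one currently playing. The class name `SoundArbiter`, the sound keys, and the choice to let equal-priority sounds replace each other are assumptions; actual playback (for example through a Qt audio class) is left out.

```python
from typing import Optional

# Higher number = higher priority, following the tiers described above.
PRIORITY = {"click": 0, "hit": 1, "tick": 2, "bell": 3}


class SoundArbiter:
    """Decides whether a requested sound may play over the current one."""

    def __init__(self) -> None:
        self.playing: Optional[str] = None

    def request(self, sound: str) -> bool:
        # Assumption: a sound of equal priority may replace the current one.
        if self.playing is not None and PRIORITY[sound] < PRIORITY[self.playing]:
            return False  # a lower tier never masks a higher one
        self.playing = sound
        return True

    def finished(self) -> None:
        # Call when the current sound ends so lower tiers can play again.
        self.playing = None


arbiter = SoundArbiter()
print(arbiter.request("bell"))   # the bell starts
print(arbiter.request("click"))  # the click is rejected while the bell plays
```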

Feature Set

The complete feature set delivered across all seven iterations is summarised below.

References