
Chapter 9: Device Agent

In the previous chapters, we built up the hub's communication layers piece by piece — IPC sockets in Chapter 3, MQTT cloud connectivity in Chapter 5, and the CDMB device bridge in Chapter 7. Now it's time to meet the component that ties them all together: the Device Agent.

Think of the Agent as the hub's brain. It sits between the cloud (AWS IoT Core) and the local device bridge (CDMB), translating commands, routing messages, and keeping device state in sync. When you say "turn off the kitchen light" through an app, the Agent is the process that receives that cloud command, translates it into something the physical device understands, sends it down to the CDMB, and reports the result back up.

What the Agent Does

At a high level, the Agent handles four responsibilities:

  1. Receives cloud commands — subscribes to MQTT topics for incoming actions (e.g., "set brightness to 50%")
  2. Translates data models — converts between the AWS data model the cloud speaks and the Matter data model devices speak
  3. Forwards requests to CDMB — sends translated commands to local devices via IPC
  4. Reports state changes upstream — publishes device notifications and state updates back to the cloud

Startup Lifecycle

The Agent's entry point is refreshingly simple. Everything begins in main.cpp:

// control/IoTSmartHomeDevice-Agent/stub/main.cpp
#include <service/iotshd_agent.hpp>

int main(int argc, char *argv[], char *envp[]) {
    return agentMainThread(argc, argv, envp);
}

The agentMainThread function creates an AgentServiceLauncher and kicks off a multi-phase startup. The launcher orchestrates everything through two methods — preLaunch() and launch():

// service/service_launcher.hpp (simplified)
bool preLaunch();  // Set up providers + managers
bool launch();     // Start all services

The preLaunch() phase follows a specific order, documented right in the header:

  1. Set up the ConnectionClientProvider (MQTT proxy + CDMB IPC clients)
  2. Set up the ManagerProvider (all business-logic managers)
  3. Create and configure each service

Only after all providers and managers are ready does launch() start the service threads. This ordering matters — services depend on managers, and managers depend on connection clients.
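The ordering constraint can be made concrete with a small sketch. Only the method names preLaunch() and launch() come from the header above; the SketchLauncher class, its log, and the stubbed setup steps are hypothetical and stand in for the real AgentServiceLauncher. The point is simply that launch() refuses to run until preLaunch() has succeeded.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the launcher; it records the order of setup steps.
class SketchLauncher {
public:
    // Phase 1: build dependencies bottom-up (clients, managers, services).
    bool preLaunch() {
        m_log.push_back("connection-clients");
        m_log.push_back("managers");
        m_log.push_back("services-configured");
        m_ready = true;
        return true;
    }

    // Phase 2: start the services, but only after phase 1 succeeded.
    bool launch() {
        if (!m_ready) {
            return false;  // dependencies are missing, nothing is started
        }
        m_log.push_back("services-started");
        return true;
    }

    const std::vector<std::string>& log() const { return m_log; }

private:
    bool m_ready = false;
    std::vector<std::string> m_log;
};
```

Calling launch() on a fresh SketchLauncher returns false; calling preLaunch() first and then launch() leaves four entries in the log, in dependency order.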

The Manager Provider Pattern

The Agent uses a provider pattern to give services access to shared managers. ManagerProvider is a singleton that lazily creates and owns every manager:

// manager/iotshd_agent_manager_provider.hpp
static ManagerProvider& getManagerProvider();
DeviceManager& getDeviceManager();
TranslationManager& getTranslationManager();
SchemaManager& getSchemaManager();
// ... and more

Each manager has a focused responsibility:

Manager             Role
DeviceManager       Tracks registered devices and their thing names
TranslationManager  Converts between AWS ↔ Matter data models
SchemaManager       Loads and caches device capability schemas
EventManager        Manages event subscriptions and routing
TokenManager        Handles authentication tokens for cloud communication
DbManager           Persists device state to local storage
FileManager         Reads configuration and schema files from disk
CapabilityManager   Manages device capability definitions

When a service needs to translate a payload, it doesn't create its own translator — it asks the singleton:

// Inside CloudCommandService
TranslationManager& m_translationManager =
    ManagerProvider::getManagerProvider()
        .getTranslationManager();

This keeps services lightweight and ensures all components share the same state.
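One common way to build such a provider in C++ is a function-local static instance that creates each manager on first request. The sketch below assumes that approach; SketchProvider and SketchTranslationManager are hypothetical names, and the real ManagerProvider may be implemented differently.

```cpp
#include <memory>
#include <mutex>

// Hypothetical manager carrying some shared state.
struct SketchTranslationManager {
    int translations = 0;
};

// Hypothetical provider: one instance per process, managers created on first use.
class SketchProvider {
public:
    static SketchProvider& instance() {
        // Function-local static: initialized once, thread-safe since C++11.
        static SketchProvider provider;
        return provider;
    }

    SketchTranslationManager& translationManager() {
        // The lock guards lazy creation against concurrent first calls.
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!m_translation) {
            m_translation = std::make_unique<SketchTranslationManager>();
        }
        return *m_translation;
    }

    SketchProvider(const SketchProvider&) = delete;
    SketchProvider& operator=(const SketchProvider&) = delete;

private:
    SketchProvider() = default;

    std::mutex m_mutex;
    std::unique_ptr<SketchTranslationManager> m_translation;
};
```

Two services that each call SketchProvider::instance().translationManager() receive references to the same object, which is what lets them share state.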

The Dual Data Model

Here's the core design challenge the Agent solves: the cloud and local devices speak different languages.

AWS data model — what the cloud understands:

Concept     Description
Endpoint    A controllable device (has an endpointId)
Capability  A feature of that device (e.g., "Brightness")
Action      A command to invoke (e.g., "SetBrightness")
Event       A state change notification

Matter data model — what devices understand:

Concept    Description
Node       A physical device (has a nodeId)
Endpoint   A functional unit within a node
Cluster    A group of related features (e.g., "LevelControl")
Attribute  A readable/writable property
Command    An invocable operation

These two models map onto each other, but the vocabulary is different. An AWS Capability corresponds roughly to a Matter Cluster. An AWS Action maps to a Matter Command. The Agent's TranslationManager handles this mapping.
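At its simplest, the Capability/Action to Cluster/Command mapping can be pictured as a lookup table. The sketch below is illustrative only: the table entries, the "Power"/"TurnOff" names, and the toMatter helper are assumptions, not the Agent's real schema. MoveToLevel and Off are genuine Matter command names, but whether the Agent maps to them is not shown in this chapter.

```cpp
#include <map>
#include <string>
#include <utility>

// AWS side: (capability, action). Matter side: (cluster, command).
using AwsKey = std::pair<std::string, std::string>;

struct MatterTarget {
    std::string cluster;
    std::string command;
};

// Illustrative table only; the entries are assumptions for this sketch.
inline const std::map<AwsKey, MatterTarget>& sketchActionTable() {
    static const std::map<AwsKey, MatterTarget> table = {
        {{"Brightness", "SetBrightness"}, {"LevelControl", "MoveToLevel"}},
        {{"Power", "TurnOff"}, {"OnOff", "Off"}},
    };
    return table;
}

// Returns the Matter target for an AWS action, or nullptr if it is unmapped.
inline const MatterTarget* toMatter(const std::string& capability,
                                    const std::string& action) {
    const auto& table = sketchActionTable();
    const auto it = table.find({capability, action});
    return it == table.end() ? nullptr : &it->second;
}
```

A real translator also has to convert parameters (for example, a brightness percentage into a Matter level value), which a plain name lookup does not cover.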

You can see the AWS model defined in AgentCommon/include/aws/:

// aws/capability.hpp
struct Capability : public ObjectType_t {
  StringField m_id{true, "id"};
  StringField m_name{true, "name"};
  StringField m_version{true, "version"};
  ArrayType_t<Action> actions_t{false, "actions"};
  ArrayType_t<Event> events_t{false, "events"};
};

And the Matter model in AgentCommon/include/matter/:

// matter/cluster.hpp
struct Cluster : public ObjectType_t {
  StringField m_id{true, "id"};
  StringField m_name{false, "name"};
  ObjectField commands_t{false, "cmds"};
  ObjectField attributes_t{false, "ats"};
};

Notice the field names are minified in the Matter model ("cmds", "ats") — this is intentional to reduce payload size for constrained devices.

The Translation Manager

The TranslationManager is the bridge between these two worlds. It exposes two primary translation directions:

// manager/iotshd_agent_translation_manager.hpp (simplified; types omitted)
// Cloud → Device (for incoming commands)
translateFromPublicToInternalDataModel(input, allocator);

// Device → Cloud (for outgoing notifications)
translateFromInternalToPublicDataModel(input, allocator);

"Public" means the AWS data model (what the cloud API uses), and "Internal" means the Matter data model (what the CDMB and devices use). The manager also handles special conversions for enums, bitmaps, and temperature units through dedicated converters:

converters::BitmapConverter m_bitmapConverter;
converters::EnumConverter m_enumConverter;
converters::TemperatureConverter m_temperatureConverter;
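To give a feel for what a unit converter has to do, here is a minimal sketch of a temperature conversion. Matter temperature attributes such as MeasuredValue in the Temperature Measurement cluster are signed 16-bit integers in hundredths of a degree Celsius. The sketch assumes the cloud side uses floating-point degrees Celsius, which this chapter does not confirm, and the function names are hypothetical rather than the real TemperatureConverter API.

```cpp
#include <cmath>
#include <cstdint>

// Cloud value (assumed: degrees Celsius) to Matter hundredths of a degree.
// No range checking: values outside the int16_t range are not handled here.
inline std::int16_t celsiusToMatter(double celsius) {
    return static_cast<std::int16_t>(std::lround(celsius * 100.0));
}

// Matter hundredths of a degree back to degrees Celsius.
inline double matterToCelsius(std::int16_t raw) {
    return static_cast<double>(raw) / 100.0;
}
```

Rounding with std::lround rather than truncating avoids off-by-one results when the scaled value is not exactly representable in floating point.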

For a whole-payload translation (used in the command flow), there's a convenience method:

bool translateWholePayload(
    const std::vector<uint8_t>& in_payload,
    std::vector<uint8_t>& out_payload,
    converters::DataModelConversion conversion);

This takes a raw byte buffer in one model and produces a byte buffer in the other — exactly what the services need when shuttling messages between cloud and CDMB.

Services: The Workers

Services are the Agent's worker threads. Each one extends AgentService, which provides a thread-safe event queue and a processing loop:

// service/iotshd_agent_service.hpp (simplified)
class AgentService : public EventHandler {
  EventRequestQueue m_serviceQueue;
  virtual void processEvent(const EventHandlerRequest&) = 0;
  virtual bool configService() = 0;
};

Every service follows the same lifecycle: Created → Configured → Running → Stopped. The serviceThreadFunction blocks on its queue, wakes up when an event arrives, and calls processEvent().
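The blocking behavior can be sketched with a mutex and a condition variable. This is not the real EventRequestQueue: SketchEventQueue, runServiceLoop, and the use of plain strings as events are assumptions made to keep the example short.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical thread-safe queue; events are plain strings for brevity.
class SketchEventQueue {
public:
    // Callable from any thread; wakes one waiting consumer.
    void push(std::string event) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_events.push(std::move(event));
        }
        m_cv.notify_one();
    }

    // Blocks until an event arrives or the queue is closed.
    // Returns false only once the queue is closed and fully drained.
    bool pop(std::string& out) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait(lock, [this] { return m_closed || !m_events.empty(); });
        if (m_events.empty()) {
            return false;
        }
        out = std::move(m_events.front());
        m_events.pop();
        return true;
    }

    // Signals shutdown; events already queued are still delivered.
    void close() {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_closed = true;
        }
        m_cv.notify_all();
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::queue<std::string> m_events;
    bool m_closed = false;
};

// The shape of a service thread function: block, wake, process, repeat.
inline std::vector<std::string> runServiceLoop(SketchEventQueue& queue) {
    std::vector<std::string> handled;
    std::string event;
    while (queue.pop(event)) {
        handled.push_back(event);  // stand-in for processEvent()
    }
    return handled;
}
```

In a real service, runServiceLoop would run on its own thread for the whole Running phase, and close() would correspond to the transition to Stopped.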

Here are the key services:

Service                 What It Does
CloudCommandService     Receives commands from the cloud, translates them, forwards to CDMB
CommandResponseService  Receives CDMB responses, translates them, publishes back to cloud
NotificationService     Handles unsolicited device events (e.g., a button press)
DeviceStateService      Tracks and publishes device online/offline state
LocalDiscoveryService   Discovers new devices on the local network via CDMB

Cloud Command Flow: End to End

Let's trace what happens when the cloud sends a "turn off the light" command. This is the Agent's most important flow, and it touches nearly every component we've discussed.

sequenceDiagram
    participant Cloud as AWS IoT Core
    participant CCS as CloudCommandService
    participant TM as TranslationManager
    participant CDMB as CDMB (IPC)
    participant CRS as CommandResponseService

    Cloud->>CCS: MQTT command (AWS model)
    CCS->>TM: translateFromPublicToInternal()
    CCS->>CDMB: send_cmd_to_cdmb_async()
    CDMB->>CRS: IPC callback with response
    CRS->>TM: translateFromInternalToPublic()
    CRS->>Cloud: MQTT publish (AWS model)

Let's walk through each step:

Step 1: Cloud delivers the command. The MQTT Proxy (from Chapter 5) receives a message on the device's command topic. The CloudCommandService is subscribed to this topic, so the message lands in its event queue.

Step 2: Parse and validate. The CloudCommandService deserializes the AWS-model payload into a CommandRequest_t, checks that the target device is registered with the DeviceManager, and validates the request structure.

Step 3: Translate. The service calls the TranslationManager to convert the AWS payload (capabilities, actions, parameters) into the Matter payload (clusters, commands, fields). This is where "SetBrightness" becomes a LevelControl cluster command.

Step 4: Forward to CDMB. The translated payload is sent to the CDMB via the IPC bridge we saw in Chapter 3:

// connection_client/luma_cdmb_api.h (simplified; parameter types omitted)
int send_cmd_to_cdmb_async(
    buf_len, seq, in_buf, rsp_cb, eventHandlerPtr);

This is an async call — it returns immediately. The CDMB will invoke the response callback when the device responds.
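The pattern is a C-style callback plus an opaque context pointer. The sketch below shows that pattern in isolation. Every type and name in it (SketchRspCb, SketchBridge, SketchResponseHandler, onCdmbResponse) is hypothetical, and the real parameter types of send_cmd_to_cdmb_async are not shown in this chapter.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical callback type: sequence number, payload, caller-supplied context.
using SketchRspCb = void (*)(std::uint32_t seq, const std::uint8_t* buf,
                             std::size_t len, void* ctx);

// Hypothetical handler that owns the queue the callback feeds.
struct SketchResponseHandler {
    struct Response {
        std::uint32_t seq;
        std::vector<std::uint8_t> payload;
    };
    std::deque<Response> queue;
};

// Callback body: copy the bytes and enqueue them; no translation happens here.
inline void onCdmbResponse(std::uint32_t seq, const std::uint8_t* buf,
                           std::size_t len, void* ctx) {
    auto* handler = static_cast<SketchResponseHandler*>(ctx);
    handler->queue.push_back({seq, std::vector<std::uint8_t>(buf, buf + len)});
}

// Stand-in for the bridge: remembers pending requests and returns at once.
struct SketchBridge {
    struct Pending {
        std::uint32_t seq;
        SketchRspCb cb;
        void* ctx;
    };
    std::vector<Pending> pending;

    int sendAsync(std::uint32_t seq, SketchRspCb cb, void* ctx) {
        pending.push_back({seq, cb, ctx});
        return 0;  // accepted; the response arrives later
    }

    // Simulates the device answering every pending request with one status byte.
    void deliverAll(std::uint8_t status) {
        for (const Pending& p : pending) {
            p.cb(p.seq, &status, 1, p.ctx);
        }
        pending.clear();
    }
};
```

Keeping the callback this small matters because it runs on the bridge's thread; the heavier work of translating and publishing is left to whichever worker drains the queue. The sequence number is what lets a response be matched to its request.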

Step 5: CDMB executes. The CDMB (from Chapter 7) routes the command to the physical device over its local protocol (Matter, Zigbee, etc.) and collects the response.

Step 6: Response callback fires. The CDMB calls back with the device's response. This callback enqueues the response into the CommandResponseService's event queue.

Step 7: Translate back. The CommandResponseService uses the TranslationManager in the reverse direction — Matter model back to AWS model.

Step 8: Publish to cloud. The translated response is published back to AWS IoT Core via the MQTT Proxy, completing the round trip.

If anything goes wrong at any step, the CloudCommandService builds a failure response with an appropriate error code and publishes it directly to the cloud. A command that fails validation or translation is never forwarded to the CDMB, so the device never sees a request the Agent could not process.
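The failure path boils down to "first failing step wins, then report upstream". The sketch below models that with boolean stand-ins for each step. The error codes, the "FAILED:" message format, and the handleCommand function are hypothetical; the Agent's real codes and response structure are not shown in this chapter.

```cpp
#include <string>

// Hypothetical error codes for this sketch.
enum class SketchError { None, UnknownDevice, TranslationFailed, BridgeUnavailable };

inline const char* toString(SketchError error) {
    switch (error) {
        case SketchError::None: return "NONE";
        case SketchError::UnknownDevice: return "UNKNOWN_DEVICE";
        case SketchError::TranslationFailed: return "TRANSLATION_FAILED";
        case SketchError::BridgeUnavailable: return "BRIDGE_UNAVAILABLE";
    }
    return "UNKNOWN";
}

struct SketchOutcome {
    bool forwarded;            // true only if the command reached the bridge
    std::string cloudMessage;  // failure response published upstream, if any
};

// Each flag stands in for one step; the first failing step decides the error.
inline SketchOutcome handleCommand(bool deviceKnown, bool translated, bool bridgeUp) {
    SketchError error = SketchError::None;
    if (!deviceKnown) {
        error = SketchError::UnknownDevice;
    } else if (!translated) {
        error = SketchError::TranslationFailed;
    } else if (!bridgeUp) {
        error = SketchError::BridgeUnavailable;
    }
    if (error != SketchError::None) {
        return {false, std::string("FAILED:") + toString(error)};
    }
    return {true, std::string()};
}
```

For example, handleCommand(false, true, true) yields a cloud message of "FAILED:UNKNOWN_DEVICE" and never forwards anything to the bridge.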

Notification Flow (Device → Cloud)

The reverse direction — a device reporting a state change — follows a similar pattern through the NotificationService:

  1. CDMB sends an unsolicited event via IPC callback
  2. NotificationService receives it, translates Matter → AWS
  3. The translated event is published to the cloud via MQTT
  4. The DeviceStateService is also notified to update its local state tracking

This is how the cloud stays in sync when someone physically flips a light switch.
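The four steps above can be sketched as a single handler that fans one device event out to two places. All names here are hypothetical, the "translation" is reduced to string formatting, and the vector and map stand in for the MQTT publish and the DeviceStateService respectively.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical Matter-side event: which node, which attribute, new value.
struct SketchNotification {
    std::string nodeId;
    std::string attribute;
    std::string value;
};

struct SketchNotificationFlow {
    std::vector<std::string> published;            // stand-in for MQTT publishes
    std::map<std::string, std::string> lastState;  // stand-in for state tracking

    // Steps 2 to 4: translate, publish upstream, update local state.
    void onDeviceEvent(const SketchNotification& n) {
        const std::string key = n.nodeId + "/" + n.attribute;
        const std::string cloudEvent = key + "=" + n.value;  // toy "translation"
        published.push_back(cloudEvent);
        lastState[key] = n.value;
    }
};
```

Feeding the flow two events for the same attribute produces two published messages but a single, most recent entry in lastState.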

Connection Clients: Two Directions

The Agent maintains two connection clients through the ConnectionClientProvider:

  • MqttProxyConnectionClient — talks upstream to AWS IoT Core via the MQTT Proxy (Chapter 5)
  • CDMBConnectionClient — talks downstream to the CDMB via IPC (Chapter 3, Chapter 7)

Every service grabs the client it needs from the provider:

// Inside CloudCommandService
ConnectionClient& m_cloudConnectionClient =
    ConnectionClientProvider::getConnectionClientProvider()
        .getMqttProxyConnectionClient();

ConnectionClient& m_cdmbConnectionClient =
    ConnectionClientProvider::getConnectionClientProvider()
        .getCDMBConnectionClient();

This two-client design mirrors the Agent's role as a bridge: one foot in the cloud, one foot on the local network.

Putting It All Together

Here's how all the pieces fit in the Agent's architecture:

                    ┌─────────────────────────────────────────┐
                    │              Device Agent               │
                    │                                         │
  AWS IoT Core ◄──► │  ┌─────────────────────────────────┐    │
  (MQTT Proxy)      │  │         Services                │    │
                    │  │  CloudCommand  CommandResponse  │    │
                    │  │  Notification  DeviceState      │    │
                    │  │  LocalDiscovery                 │    │
                    │  └──────────┬──────────────────────┘    │
                    │             │                           │
                    │  ┌──────────▼──────────────────────┐    │
                    │  │     ManagerProvider             │    │
                    │  │  Translation  Schema  Device    │    │
                    │  │  Event  Token  DB  File         │    │
                    │  └──────────┬──────────────────────┘    │
                    │             │                           │
                    │  ┌──────────▼──────────────────────┐    │
                    │  │   ConnectionClientProvider      │    │
                    │  │  MqttProxy ◄──► CDMB IPC        │    │
                    │  └─────────────────────────────────┘    │
                    └─────────────────────────────────────────┘
                                        │ IPC
                                      CDMB
                                   (Chapter 7)

The layering is intentional:

  • Services handle business logic and message routing
  • Managers handle data transformation, state, and persistence
  • Connection clients handle raw communication

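Dependencies point downward. Services call into managers and, as the CloudCommandService snippet above shows, also hold references to the connection clients directly, so the layering is a guideline rather than a strict one-layer-at-a-time rule.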

Key Takeaways

  • The Agent is the central orchestrator — it bridges cloud (MQTT) and local (IPC/CDMB) communication
  • The dual data model (AWS vs. Matter) is the core design challenge, solved by the TranslationManager
  • The provider pattern (ManagerProvider, ConnectionClientProvider) gives services shared access to managers and clients without tight coupling
  • Services are independent worker threads with their own event queues, following a consistent lifecycle
  • The cloud command flow is the most important path: Cloud → Agent → CDMB → Device → CDMB → Agent → Cloud

What's Next

The Agent handles devices that are already registered and connected. But how do new devices get discovered and provisioned in the first place? In Chapter 10, we'll explore the LPW Provisioner — the component responsible for onboarding new devices onto the hub's network and registering them with the cloud.