Narrative Motion Blocks

Sam Bourgault, Li-Yi Wei, Jennifer Jacobs, and Rubaiat Habib Kazi. 2025. Narrative Motion Blocks: Combining Direct Manipulation and Natural Language Interactions for Animation Creation. DIS 2025, Madeira, Portugal. doi: https://doi.org/10.1145/3715336.3735766 🏆 Best Paper Award
Authoring compelling animations often requires artists to come up with creative high-level ideas and translate them into precise low-level spatial and temporal properties like position, orientation, scale, and frame timing. Traditional animation tools offer direct manipulation strategies to control these properties but lack support for implementing higher-level ideas. Alternatively, AI-based tools allow animation production using natural language prompts but lack the fine-grained control over properties required for professional workflows. To bridge this gap, we propose AniMate, a hand-drawn animation system that integrates direct manipulation and natural language interaction. Central to AniMate are Narrative Motion Blocks, clip-like components located on a timeline that let animators specify animated behaviors with a combination of textual and manual input. Through an expert evaluation and the creation of short demonstrative animations, we show how focusing on intermediate-level actions provides a common representation for animators to work across both interaction modalities.
This work was done as part of an internship at Adobe Research during the summer of 2024 in collaboration with Li-Yi Wei, Jennifer Jacobs, and Rubaiat Habib Kazi.
Acknowledgement: We want to thank the four animators who participated in our formative steps and expert evaluation tasks, James Ratliff, Val Head, Rima Cao, and Seth Walker. We also want to thank Emilie Yu and Ana Maria Cardenas Gasca for helping us pilot our study. We appreciate the help from our colleagues at the Expressive Computation Lab at UCSB and at Adobe Research, who contributed valuable feedback to improve this work. Lastly, thank you to Matthew Beaudouin-Lafon for our long conversations on generative AI and creativity and feedback during the system development.
Jump to:
Context
Approach
Overview
Workflow
Recreation Examples
Expert Evaluation
Presented at
DIS 2025, Madeira, Portugal
Context
Authoring compelling animations requires artists to come up with creative high-level ideas and translate them into precise low-level spatial and temporal properties. For instance, the narrative idea “a ball rolls towards the cat” is represented by sequences of position and rotation values changing across time frames.
Figure: the high-level idea (“a ball rolls towards the cat”) translated into low-level properties.
Traditional animation tools offer direct manipulation strategies to precisely control these properties but lack direct support for implementing higher-level ideas and often involve a complex UI. Alternatively, natural-language-based tools allow the generation of stylized and realistic video using natural language prompts, usually through a simple text box interface, but lack the fine-grained control over properties that is both familiar and necessary to professional animation workflows.
direct manipulation tools
- precise authoring control ✅
- supports unique style of expression ✅
- familiar to professionals ✅
- interaction separate from narrative goals ⛔️
- complex user interface ⛔️

natural-language-based tools
- narrative approach based on natural language ✅
- simple user interface ✅
- challenging to describe spatial and temporal information ⛔️
- outcomes don't meet intentions ⛔️
- limited sense of agency ⛔️
While low-level direct manipulation and high-level natural language specifications involve fundamentally different interaction models, prior research suggests that combining these modalities can enhance creative tasks by better aligning with users’ intentions and increasing the AI's spatial awareness (Masson, DirectGPT, 2024; Kim, Stylette, 2022).
Approach
We propose that a bidirectional strategy enabling the representation and editing of animated content both iconically and symbolically over time could provide the flexible interactions necessary for animation. To investigate the combination of direct manipulation control and natural language, we introduce Narrative Motion Blocks: clip-like segments located on a timeline that enable animators to prompt an LLM (GPT-4o) to generate motions of visual elements on a canvas. Each block represents the motion of a specific visual element over a defined number of frames, determined by its length.

We developed a web application, AniMate, to explore the opportunities of Narrative Motion Blocks for animation creation. AniMate consists of 1) a canvas, where animators can draw and manipulate visual elements and motion paths, 2) a timeline, where animators create narrative motion blocks and modify their temporal properties, and 3) a panel displaying the dynamic user interface generated by the system for each block.
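As a rough illustration, a narrative motion block can be modeled as a timeline clip that ties a visual element to an action description and a frame range. This is a hypothetical sketch with made-up names, not AniMate's actual implementation:

```typescript
// Hypothetical model of a narrative motion block as a timeline clip.
interface NarrativeMotionBlock {
  id: string;
  elementId: string;   // visual element on the canvas this block animates
  action: string;      // natural language action, e.g. "roll"
  startFrame: number;  // position of the clip on the timeline
  numFrames: number;   // clip length determines the motion's duration
  params: Record<string, number>; // sliders in the dynamically generated UI
}

// A block asking the system to roll an element over 24 frames.
const rollBlock: NarrativeMotionBlock = {
  id: "block-1",
  elementId: "smiling-face",
  action: "roll",
  startFrame: 0,
  numFrames: 24,
  params: { distance: 200, angle: 360, easeIn: 0.2, easeOut: 0.2 },
};
```

Both modalities then edit the same record: natural language rewrites `action` and regenerates `params`, while direct manipulation on the canvas or timeline updates the frame range and parameter values.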
Overview

A) The animator creates a narrative motion block and uses natural language to specify an action (e.g., “roll tomato”).



D) The animator can also draw the motion path directly by moving the visual element (e.g., the butterfly).

Under the hood, we use the LLM to 1) select from template functions associated with simple intermediate-level actions like move and rotate, 2) combine template functions when necessary, or 3) generate novel functions. We then combine the data generated by the LLM with the direct manipulation data.
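One way to picture this pipeline (a sketch with hypothetical names, not the paper's code): each template maps a normalized time to a per-frame transform, and compound actions like “roll” combine simpler templates:

```typescript
// Per-frame transform of a visual element.
type Transform = { x: number; y: number; rotation: number; scale: number };
// A template function maps a normalized time t in [0, 1] to a transform.
type TemplateFn = (t: number) => Transform;

const identity: Transform = { x: 0, y: 0, rotation: 0, scale: 1 };

// Simple intermediate-level templates like "move" and "rotate".
const templates: Record<string, (amount: number) => TemplateFn> = {
  move: (distance) => (t) => ({ ...identity, x: distance * t }),
  rotate: (angle) => (t) => ({ ...identity, rotation: angle * t }),
};

// Combining templates, e.g. "roll" = move + rotate.
function combine(...fns: TemplateFn[]): TemplateFn {
  return (t) =>
    fns.reduce((acc, fn) => {
      const d = fn(t);
      return {
        x: acc.x + d.x,
        y: acc.y + d.y,
        rotation: acc.rotation + d.rotation,
        scale: acc.scale * d.scale,
      };
    }, identity);
}

const roll = combine(templates.move(200), templates.rotate(360));
// Midway through the motion the element has traveled half the
// distance and turned half the angle:
// roll(0.5) → { x: 100, y: 0, rotation: 180, scale: 1 }
```

When no template or combination fits, the third path would have the LLM emit a novel function with this same `TemplateFn` signature, so generated and templated motions stay interchangeable.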

Workflow
1. Visual element creation: The animator can draw objects using a brush tool with adjustable thickness and color.

They then use one of three methods to create a narrative motion block.
2-a. Adding a narrative motion block directly: The first method consists of directly adding a new block to the timeline by clicking the + button. This opens up a menu of existing block templates. Here the animator selects the action+object template, which creates a dropdown to select recurrent actions followed by a second dropdown to select an object on the canvas.

By default, this template is set to “move” the first object drawn on the canvas, but the animator can choose another action or write their own. Here, they choose to roll the smiling face and press ENTER to trigger the system to generate a rolling action. Once the request is completed, the system generates a motion path and a corresponding dynamic UI panel.

Once an animation is generated, the animator can make adjustments in two ways: by using the sliders in the dynamically generated UI or by manually altering the generated motion path. Here, they change the traveled distance, the rotation angle, and the ease-in and ease-out values in the panel.
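The ease-in and ease-out sliders can be thought of as reshaping normalized time before it drives the motion. A common choice is a smoothstep-style curve; this is an illustrative assumption, not the paper's formula:

```typescript
// Smoothstep easing: accelerates into and decelerates out of the motion.
function easeInOut(t: number): number {
  return t * t * (3 - 2 * t);
}

// Distance traveled at frame f of an n-frame block.
function traveled(f: number, n: number, distance: number): number {
  return distance * easeInOut(f / n);
}
// traveled(12, 24, 200) → 100 (the curve passes through the midpoint)
```

A slider then blends between linear time and the eased curve, giving the animator continuous control over how abrupt the start and end of the motion feel.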

The animator can also write a custom action. Here they want the smiling face to “shake”.

They are, however, not satisfied with the resulting animation, so they modify the narrative motion block to add more specificity to the action and request specific controls. They ask for an adjustable frequency to control the speed of the shake, which the system generates as a slider tunable by the animator.

2-b. Animating a visual element directly to create a new narrative motion block: A second method to create a narrative motion block is to animate a visual element directly on the canvas. Here, the animator traces the butterfly motion path manually. This action automatically generates a new block on the timeline.

While preserving the motion path, the animator can ask the system for further modifications. Here, they ask the system to alternate between two butterfly poses and to align the butterfly to the direction of the path, creating the effect of a frame-by-frame animation.
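Aligning an element to the direction of a drawn path amounts to orienting it along the path's tangent at each frame. A minimal sketch, treating the traced path as a hypothetical polyline of sampled points:

```typescript
type Point = { x: number; y: number };

// Orientation (in radians) of an element following a polyline path,
// taken from the tangent between neighboring path samples.
function headingAt(path: Point[], i: number): number {
  const a = path[Math.max(i - 1, 0)];
  const b = path[Math.min(i + 1, path.length - 1)];
  return Math.atan2(b.y - a.y, b.x - a.x);
}

// Alternating between two poses every few frames gives the
// frame-by-frame animation feel.
function poseAt(frame: number, period = 4): "wings-up" | "wings-down" {
  return Math.floor(frame / period) % 2 === 0 ? "wings-up" : "wings-down";
}
```

Because the orientation is derived from the path rather than stored per frame, the manually drawn trajectory stays intact when the language request changes the element's poses.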

2-c. Duplicating an existing narrative motion block: Finally, the animator can duplicate an existing block and modify the parameters of the copy. They can also change the object it is linked to. Here, they relink the new block to the green sun rays.

By duplicating the initial sun rays block, offsetting the copy in time, and changing the final scale, they are able to create a background for the animation.

Recreation Examples
We evaluated our system by creating three demonstrative examples. Beyond the example above, we recreated a sushi animation by Clemens Makoschitz to test our system’s ability to animate visual elements that appear to interact with one another.
Jumping sushi recreated with AniMate

We also recreated the abstract cylinder animation by RAPAPAWN studio to attempt the creation of motion graphics and narratives that extend our templated options.
Oscillating cylinders recreated with AniMate

Expert Evaluation
We conducted an expert evaluation consisting of two tasks with three professional animators. In the first task, we asked them to reproduce an animation we had previously created, without telling them how we achieved it. The animators successfully produced similar animations within 15 minutes, using strategies both similar to and different from ours.
Animation created by authors


Reproduced by animator Rima Cao


In the second task, we asked the animators to create a unique animation using the AniMate software.

