Siggraph Presentation

This guide will be officially introduced at Siggraph 2023 - Houdini Hive on Wednesday, August 9th 2023 at 11:00 AM PST.

Particles

Importing particles (points) is the simplest form of geometry import.

Let's see how we can make it complicated 🤪. We'll take a look at these two cases:

  • Houdini native point import
  • Render overrides via (Numpy) Python wrangles

You can find all the .hip files of our shown examples in our USD Survival Guide - GitHub Repo.

For all options for SOP to LOP importing, check out the official Houdini docs.

Houdini Native Import

When importing points, all you need to set is a path attribute on your points (rather than on prims as with polygon meshes), because we don't have any prims at the SOP level. (Thanks, captain obvious.)
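The segmentation Houdini performs under the hood can be sketched with numpy: given a per-point path attribute, bucket the point data per target prim path. All names and values here are hypothetical stand-ins for real SOP attribute data.

```python
import numpy as np

# Hypothetical per-point "path" attribute values, as read from SOP geometry.
paths = np.array([
    "/geo/debris", "/geo/sparks", "/geo/debris", "/geo/sparks", "/geo/debris",
])
positions = np.arange(15, dtype=np.float32).reshape(5, 3)

# np.unique gives us one bucket (future prim) per unique path value.
unique_paths, inverse = np.unique(paths, return_inverse=True)
buckets = {path: positions[inverse == i] for i, path in enumerate(unique_paths)}
```

Each bucket then becomes its own `Points` prim on import.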

For an exercise, let's build a simple SOP import ourselves. Should we use this in production? No: Houdini's geo conversion is a lot faster when it comes to segmenting your input based on the path attribute. Nonetheless, it is a fun little demo:

import numpy as np
from pxr import UsdGeom

# This runs in a Python LOP: `node`, `stage` and the `hou` module are provided by Houdini.
sop_node = node.parm("spare_input0").evalAsNode()
sop_geo = sop_node.geometry()

frame = hou.frame()

prim = stage.DefinePrim("/points", "Points")
attribute_mapping = {
    "P": "points",
    "id": "ids",
    "pscale": "widths",
    "Cd": "primvars:displayColor",
}
# Import attributes
for sop_attr in sop_geo.pointAttribs():
    attr_name = attribute_mapping.get(sop_attr.name())
    if not attr_name:
        continue
    attr = prim.GetAttribute(attr_name)
    if not attr:
        continue
    attr_type_name = attr.GetTypeName()
    attr_type = attr_type_name.type
    attr_class = attr_type.pythonClass
    # The pointFloatAttribValuesAsString always gives us a float array
    value = np.frombuffer(sop_geo.pointFloatAttribValuesAsString(sop_attr.name()), dtype=np.float32)
    # This casts it back to its correct type
    attr.Set(attr_class.FromNumpy(value), frame)
    # Enforce "vertex" (Houdini speak "Point") interpolation
    attr.SetMetadata("interpolation", "vertex")
# Re-Compute extent hints
boundable_api = UsdGeom.Boundable(prim)
extent_attr = boundable_api.GetExtentAttr()
extent_value = UsdGeom.Boundable.ComputeExtentFromPlugins(boundable_api, frame)
if extent_value:
    extent_attr.Set(extent_value, frame)

Why should we do it ourselves, you might ask? Because there are cases where we can load the arrays in directly, since we know we are only editing a single prim. Some of Houdini's nodes actually use this mechanism in a few of the point instancer related nodes.
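The `np.frombuffer` cast in the code above can be sketched in isolation; here a raw byte string stands in for what `sop_geo.pointFloatAttribValuesAsString("P")` would return:

```python
import numpy as np

# Stand-in for the raw float32 buffer Houdini hands us for the "P" attribute.
num_points = 4
raw = np.arange(num_points * 3, dtype=np.float32).tobytes()

# The buffer always comes back as a flat float32 array ...
flat = np.frombuffer(raw, dtype=np.float32)
# ... so reshaping recovers the (N, 3) positions without copying the data.
positions = flat.reshape(-1, 3)
```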

Render-Time Overrides via (Numpy) Python Wrangles

Now you might be thinking: Is Python performant enough to actually manipulate geometry?

In the context of points (and point instancers), the answer is yes. Since we don't have to run geometry operations, manipulating points is "just" editing arrays. This can be done very efficiently via numpy, if we use it for final tweaking. So don't expect the power of VEX: the below is a "cheap" solution for adding render time overrides, when you don't have the resources to write your own compiled language (looking at you, DNEG (OpenVDB AX)).
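A minimal sketch of what such a numpy "wrangle" looks like: vectorized array math over all points at once, with no per-point Python loop. The attribute names and values are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
num_points = 1_000_000
widths = rng.random(num_points, dtype=np.float32)
velocities = rng.random((num_points, 3), dtype=np.float32)

# Render time overrides as plain array math: double the widths ...
widths *= 2.0
# ... and nudge the velocities along +Y, all in a few milliseconds.
velocities[:, 1] += 1.0
```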

In the near future, this can probably also be done via Houdini's render procedurals (it actually already can be). The method we show here is DCC independent though, so it does have its benefits, because you don't need a Houdini (Engine) license.

To showcase how we manipulate arrays at render time, we've built a "Python Wrangle" .hda. Here is the basic .hda structure:

As discussed in our Creating efficient LOPs .Hdas section, we start and end the Hda with a new layer to ensure that we don't "suffer" from the problem of our layer getting too data heavy. Then we have two Python nodes: The first one serializes the Hda parms to a json dict and stores it on our point prims, the second one modifies the attributes based on the parm settings. Why do we need to separate the data storage and the execution? Because we only want to opt in to the Python code execution at render time. That's why we put down a switch node that is driven via a context variable. Context variables are similar to global Houdini variables, the main difference being that they are scoped to a section of our node graph or are only set when we trigger a render.
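Conceptually, the first Python node's serialization step boils down to dumping the parm values to a json string that gets stored as a string attribute on the prim. The parm names below are hypothetical; the real Hda stores its settings under the `vfxSurvivalGuide:attributeKernel:` attribute namespace shown further down.

```python
import json

# Hypothetical Hda parm values to persist on the prim.
parms = {
    "mode": "kernel",
    "seed": 42,
    "bindings": [
        {"property_name": "points", "variable_name": "P", "value_type_name": "automatic"},
    ],
}

# Stored as a string attribute at cache time ...
serialized = json.dumps(parms)
# ... and read back by the second node / pre render script at render time.
restored = json.loads(serialized)
```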

This means that when rendering the USD file to disk, all the points will store is our wrangle code (and the original point data; in production this usually comes from another, already cached USD file that was payloaded in). In our pre render scripts, we can then iterate over our stage and execute our code.
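Stripped of all USD specifics, the render time execution pattern is: compile the stored code string and `exec` it against the point arrays, exactly as the kernel executor below does. A minimal stand-alone sketch:

```python
import numpy as np

# Stand-in for the wrangle snippet a prim would carry as a string attribute.
stored_code = "pscale = pscale * 0.5"
pscale = np.ones(8, dtype=np.float32)

# Compile once, then execute against a namespace holding the point arrays.
namespace = {"pscale": pscale, "np": np}
exec(compile(stored_code, "wrangle", "exec"), namespace)
pscale = namespace["pscale"]
```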

Let's talk about performance: The more attributes we manipulate, the slower it will get. To stress test this, let's build a point replicator with a constant seed. "Upres-ing" from 1 million to 10 million points takes around 30 seconds. For such a "cheap" solution to implement, I'd say that is manageable for interactivity. We could achieve something similar by just using a point instancer prim and upres-ing our prototypes, but this method allows for per-point overrides, which gives us more detail.
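A point replicator of this kind can be sketched in a few lines of numpy (function name and jitter scheme are our own invention, not the Hda's exact implementation):

```python
import numpy as np

def replicate(points, copies_per_point, seed=0, jitter=0.1):
    """Scatter `copies_per_point` jittered copies around each input point."""
    # A constant seed keeps the result stable across re-cooks/frames.
    rng = np.random.default_rng(seed)
    tiled = np.repeat(points, copies_per_point, axis=0)
    offsets = rng.uniform(-jitter, jitter, size=tiled.shape).astype(points.dtype)
    return tiled + offsets

points = np.zeros((1_000, 3), dtype=np.float32)
replicated = replicate(points, 10)
```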

Here is a demo of a point replicator:

Another cool thing is that this is actually not limited to point prims (we lied in our intro 😱). Since all attributes are just arrays of data, we can run the Python wrangle on any prim. For example, if we just want to increase our pscale widths or multiply our velocities, operating on an array via numpy is incredibly fast; we're talking a few hundred milliseconds at most for a few million points. As mentioned in our data types section, USD implements the buffer protocol, so we don't actually duplicate any memory until really needed, and mapping Vt.Arrays to numpy is as straightforward as casting the array to a numpy array.
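The zero-copy behavior of the buffer protocol can be demonstrated without USD; here a `bytearray` stands in for the memory a `Vt.Array` would expose:

```python
import numpy as np

# A writable buffer, standing in for the memory behind a Vt.Array.
buffer = bytearray(np.arange(6, dtype=np.float32).tobytes())

# np.frombuffer wraps the existing memory instead of copying it ...
view = np.frombuffer(buffer, dtype=np.float32)
view[0] = 42.0
# ... so writes through the numpy view land in the source buffer.
```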

Now the below code might look long, but the important bits are:

  • Getting the data: np.array(attr.Get(frame))
  • Setting the data: attr.Set(attr.GetTypeName().type.pythonClass.FromNumpy(output_data[binding.property_name]), frame)
  • Updating the extent hint: UsdGeom.Boundable.ComputeExtentFromPlugins(boundable_api, frame)

Python Wrangle Hda | Summary | Click to expand!

    ...
    # Read data
    input_data[binding.property_name] = np.array(attr.Get(frame))
    ...
    # Write data
    for binding in bindings:
        attr = prim.GetAttribute(binding.property_name)
        if len(output_data[binding.property_name]) != output_point_count:
            attr.Set(pxr.Sdf.ValueBlock())
            continue
        attr_class = attr.GetTypeName().type.pythonClass
        attr.Set(attr_class.FromNumpy(output_data[binding.property_name]), frame)
    ...
    # Re-Compute extent hints
    boundable_api = pxr.UsdGeom.Boundable(prim)
    extent_attr = boundable_api.GetExtentAttr()
    extent_value = pxr.UsdGeom.Boundable.ComputeExtentFromPlugins(boundable_api, frame)
    if extent_value:
        extent_attr.Set(extent_value, frame)

The code for our "python kernel" executor:

Python Wrangle Hda | Python Kernel | Click to expand!

import pxr
import numpy as np

class Tokens():
    mode = "vfxSurvivalGuide:attributeKernel:mode"
    module_code = "vfxSurvivalGuide:attributeKernel:moduleCode"
    execute_code = "vfxSurvivalGuide:attributeKernel:executeCode"
    bindings = "vfxSurvivalGuide:attributeKernel:bindings"

class Binding():
    fallback_value_type_name = pxr.Sdf.ValueTypeNames.FloatArray
    def __init__(self):
        self.property_name = ""
        self.variable_name = ""
        self.value_type_name = ""

class Point():
    def __init__(self, bindings):
        self.ptnum = -1
        self.bindings = bindings
        for binding in self.bindings:
            setattr(self, binding.variable_name, None)
                
class Points():
    def __init__(self):
        self.bindings = []
        for binding in self.bindings:
            setattr(self, binding.variable_name, [])
          
def run_kernel(stage, frame):
    # Process
    for prim in stage.Traverse():
        attr = prim.GetAttribute(Tokens.bindings)
        if not attr:
            continue
        if not attr.HasValue():
            continue

        mode = prim.GetAttribute(Tokens.mode).Get(frame)
        module_code = prim.GetAttribute(Tokens.module_code).Get(frame)
        execute_code = prim.GetAttribute(Tokens.execute_code).Get(frame)
        bindings_serialized = eval(prim.GetAttribute(Tokens.bindings).Get(frame))
        
        # Bindings
        bindings = []
        for binding_dict in bindings_serialized:
            binding = Binding()
            binding.property_name = binding_dict["property_name"]
            binding.variable_name = binding_dict["variable_name"]
            binding.value_type_name = binding_dict["value_type_name"]
            bindings.append(binding)
        # Kernel
        module_code_compiled = compile(module_code, "module_code", "exec")
        execute_code_compiled = compile(execute_code, "code", "exec")
        exec(module_code_compiled)
        # Read data
        input_data = {}
        input_point_count = -1
        output_data = {}
        output_point_count = -1
        initialize_attrs = []
        for binding in bindings:
            # Read attribute or create default fallback value
            attr = prim.GetAttribute(binding.property_name)
            if not attr:   
                value_type_name_str = binding.fallback_value_type_name if binding.value_type_name == "automatic" else binding.value_type_name
                value_type_name = getattr(pxr.Sdf.ValueTypeNames, value_type_name_str)
                attr = prim.CreateAttribute(binding.property_name, value_type_name)
            if attr.HasValue():
                input_data[binding.property_name] = np.array(attr.Get(frame))
                input_point_count = max(input_point_count, len(input_data[binding.property_name]))
            else:
                initialize_attrs.append(attr)
            # Enforce interpolation to points
            attr.SetMetadata(pxr.UsdGeom.Tokens.interpolation, pxr.UsdGeom.Tokens.vertex)
            output_data[binding.property_name] = []
        for initialize_attr in initialize_attrs:
            input_data[initialize_attr.GetName()] = np.array([initialize_attr.GetTypeName().scalarType.defaultValue] * input_point_count)
        # Utils
        def npoints():
            return input_point_count
        # Modes
        if mode == "kernel":
            # Kernel Utils
            points_add = []
            points_remove = set()
            def create_point():
                point = Point(bindings)
                points_add.append(point)
            def copy_point(source_point):
                point = Point(source_point.bindings)
                for binding in point.bindings:
                    setattr(point, binding.variable_name, np.copy(getattr(source_point, binding.variable_name)))
                points_add.append(point)
                return point
            def remove_point(point):
                points_remove.add(point.ptnum)
            # Kernel
            point = Point(bindings)
            for elemnum in range(len(list(input_data.values())[0])):
                point.ptnum = elemnum
                for binding in bindings:
                    setattr(point, binding.variable_name, input_data[binding.property_name][elemnum])
                # User Kernel Start
                exec(execute_code_compiled)
                # User Kernel End
                if points_remove and point.ptnum in points_remove:
                    continue
                for binding in bindings:
                    output_data[binding.property_name].append(getattr(point, binding.variable_name))
            for binding in bindings:
                for point_add in points_add:
                    output_data[binding.property_name].append(getattr(point_add, binding.variable_name))
            for binding in bindings:
                output_data[binding.property_name] = np.array(output_data[binding.property_name])
                output_point_count = max(output_point_count, len(output_data[binding.property_name]))
        elif mode == "array":
            points = Points()
            for binding in bindings:
                setattr(points, binding.variable_name, input_data[binding.property_name])
            # User Kernel Start
            exec(execute_code_compiled)
            # User Kernel End
            for binding in bindings:
                output_data[binding.property_name] = getattr(points, binding.variable_name)
                output_point_count = max(output_point_count, len(output_data[binding.property_name]))
        # If the output is invalid, block it to prevent segfaults
        if input_point_count != output_point_count:
            for attr in prim.GetAttributes():
                if attr.GetName() in input_data.keys():
                    continue
                if not attr.HasValue():
                    continue
                if not attr.GetTypeName().isArray:
                    continue
                if len(attr.Get(frame)) == output_point_count:
                    continue
                attr.Set(pxr.Sdf.ValueBlock())
        # Write data
        for binding in bindings:
            attr = prim.GetAttribute(binding.property_name)
            if len(output_data[binding.property_name]) != output_point_count:
                attr.Set(pxr.Sdf.ValueBlock())
                continue
            attr_class = attr.GetTypeName().type.pythonClass
            attr.Set(attr_class.FromNumpy(output_data[binding.property_name]), frame)
        # Re-Compute extent hints
        boundable_api = pxr.UsdGeom.Boundable(prim)
        extent_attr = boundable_api.GetExtentAttr()
        extent_value = pxr.UsdGeom.Boundable.ComputeExtentFromPlugins(boundable_api, frame)
        if extent_value:
            extent_attr.Set(extent_value, frame)

The code for our pre render script:

Python Wrangle Hda | Pre-Render Script | Click to expand!

import os
import sys
from importlib import reload
# Python Wrangle Module
dir_path = os.path.dirname(hou.hipFile.path())
if dir_path not in sys.path:
    sys.path.insert(0, dir_path)
import pythonWrangle
reload(pythonWrangle)
from pythonWrangle import run_kernel

frame = hou.frame()
run_kernel(stage, frame)