I'm so incredibly, debilitatingly tired of this recurring syndrome
in the demoscene
wherein someone innocent and wide-eyed shows up and expresses some
interest about getting started with demo coding, and the immediate
reaction is to stick a pointy-clicky demotool in their hands and
call it a day, and then the net result is a demo compo where half
the entries run on Unity or Unreal, and the other half don't run
at all.
This is ridiculous, this is embarrassing, this is unsustainable,
and worst of all we've
been here before. To this end, I decided to put my money where
my mouth is by providing you with a quick walkthrough on how to render a mesh on screen using
modern, reliable methods, in about 40 steps and less than 400 lines
of code, in a perhaps futile attempt to a) demonstrate that starting
actual demo coding isn't as hard as people make it sound and
b) make people care about - let's remember - what
we told UNESCO
this stuff is supposed to be about.
As you can tell, this is a tutorial intended for people who can already
program in C++; some linear algebra and a basic understanding of
rendering are also recommended.
We'll be using C++ and DirectX 11 to render stuff, the simple reason
being that we would very much like our demo to work - you can play
around with OpenGL if you must, but let's face it: you're never going
to compile for Mac or Linux, and the audience you lose there is a fraction
of what you lose by making your demo GPU-vendor specific.
(If you really really must, the Internet is chock full of
SDL tutorials.)
With that in mind, let's go.
The basics
I'm not even going to explain this; if this step stumps you already
you're probably better off at
Codecademy
or something.
Opening a window
Right, so the first step in our plumbing process is going to be opening
a window through WINAPI; this is a fairly well documented process so
we're going to just go through it quickly because let's face it - you
can waste all the time in the world on this to make it nice and shiny,
but it doesn't actually matter. You can do that later once everything else
is in place.
First, we include windows.h, which contains the declarations
for all the stuff that we'll be using.
Then we create a "window class"; this basically defines the behavior of
our window, but the only thing we really care about right now is the
window procedure (which we'll call WndProc),
which is a callback that defines how a window reacts
when something happens; if you want to learn more,
the official documentation
is always there.
Our window procedure is as simple as a stick: we have a
gWindowWantsToQuit flag that signals if the user wants to quit,
and we set it to true if someone either pressed Escape or clicked the
window's close button. We also disable the screensaver.
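Something in this spirit - a rough sketch rather than gospel; the flag is the one described above, the rest is just one reasonable way to wire it up:

#include <windows.h>   // declarations for all the WINAPI stuff we use below

static bool gWindowWantsToQuit = false;

LRESULT CALLBACK WndProc(HWND hWnd, UINT message, WPARAM wParam, LPARAM lParam)
{
  switch (message)
  {
    case WM_KEYDOWN:
      if (wParam == VK_ESCAPE)   // Escape pressed: time to go
        gWindowWantsToQuit = true;
      break;
    case WM_CLOSE:               // close button: same thing
      gWindowWantsToQuit = true;
      break;
    case WM_SYSCOMMAND:
      // Don't let the screensaver or monitor power-down kick in mid-demo.
      if ((wParam & 0xFFF0) == SC_SCREENSAVE || (wParam & 0xFFF0) == SC_MONITORPOWER)
        return 0;
      break;
  }
  return DefWindowProc(hWnd, message, wParam, lParam);
}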
Finally we open a window where the client area (i.e. without the title
bar and border) is 1280x720. Again,
read the docs if you're curious.
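Roughly like so - a sketch: the class and window names are placeholders, the ANSI ("A") flavors of the calls are used to dodge Unicode string fiddling, and AdjustWindowRect is what grows the rectangle so the client area ends up at 1280x720:

WNDCLASSA windowClass = {};
windowClass.lpfnWndProc = WndProc;
windowClass.hInstance = GetModuleHandle(NULL);
windowClass.hCursor = LoadCursor(NULL, IDC_ARROW);
windowClass.lpszClassName = "demo";
RegisterClassA(&windowClass);

RECT rect = { 0, 0, 1280, 720 };
AdjustWindowRect(&rect, WS_OVERLAPPEDWINDOW, FALSE);   // account for title bar and border

HWND hWnd = CreateWindowA("demo", "demo", WS_OVERLAPPEDWINDOW | WS_VISIBLE,
  CW_USEDEFAULT, CW_USEDEFAULT, rect.right - rect.left, rect.bottom - rect.top,
  NULL, NULL, windowClass.hInstance, NULL);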
Here's our main loop, this is what we will continually be running until
user input tells us to stop: Right now it does nothing except ensure
that everything that happens to the window is processed (as usual, docs,
but hilariously even Wikipedia),
but eventually everything that's not loading will happen here.
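The skeleton of it is about this much (a sketch):

while (!gWindowWantsToQuit)
{
  // Pump every pending window message so the window stays responsive.
  MSG msg;
  while (PeekMessage(&msg, NULL, 0, 0, PM_REMOVE))
  {
    TranslateMessage(&msg);
    DispatchMessage(&msg);
  }

  // ...everything that's not loading will eventually go here...
}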
Finally we deallocate everything because we're not lazy slobs.
At this point if you compile and run this you should get a nice big
solid colored window that does nothing except close when you press Esc.
So far so good.
Adding music replay
We have our window, so let's add some audio first; we're going to use
Miniaudio for this because we're not going to bother writing our own
MP3 player. (You can do that later.)
Now, you may wonder what #define MINIAUDIO_IMPLEMENTATION
does - don't. You can wonder about that later.
Next, we initialize the sound player and load our music (supplying the MP3
file is left up to the reader); once again, there's docs
and that's all you need to know.
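A minimal sketch using miniaudio's high-level engine API - error handling omitted, and "music.mp3" is a placeholder filename:

#define MINIAUDIO_IMPLEMENTATION   // (told you not to wonder about this yet)
#include "miniaudio.h"

ma_engine engine;
ma_sound music;

ma_engine_init(NULL, &engine);                                         // default playback device
ma_sound_init_from_file(&engine, "music.mp3", 0, NULL, NULL, &music);  // load the track
ma_sound_start(&music);                                                // and off we go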
Then in our main loop, we continually poll the timer for where our music
replay is (in seconds), so that we can use that to synchronize our demo.
There's nothing to synchronize just yet, of course.
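In the main loop, that polling is about one call (a sketch):

float musicTime = 0.0f;
ma_sound_get_cursor_in_seconds(&music, &musicTime);   // how far into the track are we?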
(I'm contractually obligated to mention that Miniaudio's timer isn't
perfect and might be lower resolution than desired; you can start
worrying about this once it becomes a problem.)
And again, we deallocate everything once we're done.
So now our empty window should have sound. Cool, let's keep on going.
Initializing DirectX
Now it's time for the first bigger lift: Connecting DirectX 11.0 to our
window.
First, the usual includes; generally #pragma-ing a library
in a source file is considered bad form, but we're keeping everything
in a single file just for completeness' sake - otherwise, you know how
to use linker settings, right...?
Next, we initialize Direct3D 11 by telling it what size window we have
and how we want it to work; as usual, there's docs
and you'll undoubtedly learn about what a swap chain is, but for now,
you don't need to care beyond "thing that puts picture on screen".
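A sketch of what that boils down to - the #pragma is the "bad form" bit mentioned above, and error handling is omitted:

#include <d3d11.h>
#pragma comment(lib, "d3d11.lib")

DXGI_SWAP_CHAIN_DESC swapChainDesc = {};
swapChainDesc.BufferDesc.Width = 1280;
swapChainDesc.BufferDesc.Height = 720;
swapChainDesc.BufferDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
swapChainDesc.SampleDesc.Count = 1;                           // no multisampling
swapChainDesc.BufferUsage = DXGI_USAGE_RENDER_TARGET_OUTPUT;
swapChainDesc.BufferCount = 1;
swapChainDesc.OutputWindow = hWnd;
swapChainDesc.Windowed = TRUE;

ID3D11Device* device = NULL;
ID3D11DeviceContext* context = NULL;
IDXGISwapChain* swapChain = NULL;

D3D11CreateDeviceAndSwapChain(NULL, D3D_DRIVER_TYPE_HARDWARE, NULL, 0, NULL, 0,
  D3D11_SDK_VERSION, &swapChainDesc, &swapChain, &device, NULL, &context);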
We then fetch our back buffer - the screen buffer we're rendering into
- from D3D, and create a render target view for it; "views" are
essentially D3D11's way of representing a buffer to a subsystem,
so e.g. a texture can have a "shader resource view" so that shaders can use it,
but also a "render target view" so that you can use it as a render target.
This may sound complicated, but it will all make sense in the end.
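In code, that's about this much (a sketch):

ID3D11Texture2D* backBuffer = NULL;
swapChain->GetBuffer(0, __uuidof(ID3D11Texture2D), (void**)&backBuffer);

ID3D11RenderTargetView* renderTargetView = NULL;
device->CreateRenderTargetView(backBuffer, NULL, &renderTargetView);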
Now, because we're hoping to render some 3D, we'll also create a
depth-stencil texture (if you don't know what a Z-buffer is,
just read Wikipedia)
and a depth stencil view - see above. This way, stuff that's up front
will correctly occlude what's in the back.
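Again, roughly (a sketch; 24-bit depth with 8-bit stencil is a safe default format):

D3D11_TEXTURE2D_DESC depthDesc = {};
depthDesc.Width = 1280;
depthDesc.Height = 720;
depthDesc.MipLevels = 1;
depthDesc.ArraySize = 1;
depthDesc.Format = DXGI_FORMAT_D24_UNORM_S8_UINT;
depthDesc.SampleDesc.Count = 1;
depthDesc.Usage = D3D11_USAGE_DEFAULT;
depthDesc.BindFlags = D3D11_BIND_DEPTH_STENCIL;

ID3D11Texture2D* depthTexture = NULL;
device->CreateTexture2D(&depthDesc, NULL, &depthTexture);

ID3D11DepthStencilView* depthStencilView = NULL;
device->CreateDepthStencilView(depthTexture, NULL, &depthStencilView);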
This concludes our initialization phase, so we can move on to rendering.
...Okay, so rendering won't be too exciting for now, since we have nothing
to render yet, but we can clear the screen with an arbitrary color and
that's good enough so far. We're going to use a non-black color here
just so that we can see if we've done anything - you can use whatever
you like.
We're also clearing the depth buffer, but that won't have much of an
effect until later.
(Sidequest: You can totally animate the clear color with the music
position, if you feel so inspired.)
Once we're done with our "rendering" (we'll get there, promised),
we just tell the swap chain to "present" our buffer, i.e. put it on
screen.
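Put together, a frame looks something like this (a sketch; the clear color is whatever you fancy):

float clearColor[4] = { 0.1f, 0.2f, 0.5f, 1.0f };
context->ClearRenderTargetView(renderTargetView, clearColor);
context->ClearDepthStencilView(depthStencilView, D3D11_CLEAR_DEPTH, 1.0f, 0);

// ...actual rendering will go here eventually...

swapChain->Present(0, 0);   // put the back buffer on screen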
...And we deallocate because we're wonderful people.
With that, our window now has color - sweet! Let's move on to
loading a mesh.
Loading a mesh
Now we're going to load a mesh from a GLTF file
- why that format, you may ask? Because it's as good as any.
We're going to use cgltf,
so get the file
and #include it; don't worry about #define _CRT_SECURE_NO_WARNINGS,
just trust me.
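Which is to say, something like:

#define _CRT_SECURE_NO_WARNINGS   // just trust me
#define CGLTF_IMPLEMENTATION      // same single-header trick as miniaudio
#include "cgltf.h"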
The first thing we need to do is create a rasterizer state
because reasons I'll explain later.
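A sketch of one way to set it up - the flag to pay attention to is the winding-order one, which is the "reasons" in question (more on that in the animation section):

D3D11_RASTERIZER_DESC rasterizerDesc = {};
rasterizerDesc.FillMode = D3D11_FILL_SOLID;
rasterizerDesc.CullMode = D3D11_CULL_BACK;
rasterizerDesc.FrontCounterClockwise = TRUE;   // the "reasons" - see later
rasterizerDesc.DepthClipEnable = TRUE;

ID3D11RasterizerState* rasterizerState = NULL;
device->CreateRasterizerState(&rasterizerDesc, &rasterizerState);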
Next, we load the mesh - again, docs are your friend.
(As usual, supplying the mesh is an exercise left to the reader, but I heard
Blender is free now, so chop-chop!)
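A sketch, with a placeholder filename and no error checking:

cgltf_options options = {};
cgltf_data* model = NULL;
cgltf_parse_file(&options, "mesh.gltf", &model);    // parse the JSON part
cgltf_load_buffers(&options, model, "mesh.gltf");   // pull in the binary buffer data too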
Once the mesh is loaded, we create a vertex and index buffer
that houses our mesh data; really all we're doing is taking our mesh data
provided by the buffer views in cgltf and telling D3D11 to
create a vertex buffer
out of it; index buffers work on the same principle, pretty much.
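Here's a sketch of the vertex buffer half, assuming we've already dug the first primitive's POSITION accessor out of the cgltf data (positionAccessor below) and that the positions are tightly packed; the index buffer is the same dance with D3D11_BIND_INDEX_BUFFER:

D3D11_BUFFER_DESC vertexBufferDesc = {};
vertexBufferDesc.ByteWidth = (UINT)(positionAccessor->count * 3 * sizeof(float));
vertexBufferDesc.Usage = D3D11_USAGE_DEFAULT;
vertexBufferDesc.BindFlags = D3D11_BIND_VERTEX_BUFFER;

D3D11_SUBRESOURCE_DATA vertexBufferData = {};
vertexBufferData.pSysMem = (char*)positionAccessor->buffer_view->buffer->data
                         + positionAccessor->buffer_view->offset
                         + positionAccessor->offset;

ID3D11Buffer* vertexBuffer = NULL;
device->CreateBuffer(&vertexBufferDesc, &vertexBufferData, &vertexBuffer);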
And we deallocate, as we should have gotten used to it now.
You'll note that we haven't rendered anything - that's because we
can't really render anything without shaders. That's next.
Adding shaders
So, shaders, then.
Shaders
are probably the most misunderstood part of the rendering
toolkit, thanks in no small part to Shadertoy, which led way too many people
to believe that the only way to render 3D with a computer is by combining
crazy unintuitive maths into a signed distance field function (and then
wondering why the frame-rate drops through the floor in fullscreen).
Rest assured, while there's useful stuff to learn from there,
that use of shaders is niche, and should not be taken as gospel.
Short version: no, that's not what shaders are. Long version is that
there are many shader types, from vertex through compute to pixel/fragment,
and all they do is provide a way to run a function to produce a given
vertex or pixel when the rendering pipeline reaches that stage -
a vertex shader's job is to transform the raw mesh data onto their final
position on the screen, while a pixel shader's job is to calculate
the final color of a pixel on screen as it is being rendered.
At the bare minimum, we will need a vertex and a pixel shader to get
anything on screen, so that's what we will use.
First, we include D3D11's shader compiler.
This is going to be our super basic shader for now - as you can see
all it does is take the input vertex position and send it through
to the output immediately, and the pixel shader literally just paints
white - we'll iterate on these later, but let's get our plumbing
done first.
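Something along these lines - a sketch; the entry point names (VSMain / PSMain) are just my choice:

static const char* gShaderSource = R"SHADER(

float4 VSMain( float3 position : POSITION ) : SV_POSITION
{
  // Pass the vertex position through untouched (for now).
  return float4( position, 1.0 );
}

float4 PSMain( float4 position : SV_POSITION ) : SV_TARGET
{
  // Paint everything white (for now).
  return float4( 1.0, 1.0, 1.0, 1.0 );
}

)SHADER";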
(Before you ask: Yes, you can have the shader in a separate text file if you want to,
we're just including it here as a string to, as said before,
keep everything in a single source file for demonstration's sake.
Incidentally, running your demo in a window and reloading the shader file
if it has changed is the most basic version of a "demotool".)
We compile each shader by telling the compiler which entry
point to look for and which shader model to target
(docs)
and once compilation succeeds, we use the resulting blob to create the
actual shader. We do this for both the vertex and the pixel shader.
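A sketch of the compile-and-create dance, including the compiler include mentioned above; "vs_4_0" / "ps_4_0" are just reasonable targets for our purposes:

#include <d3dcompiler.h>
#pragma comment(lib, "d3dcompiler.lib")

ID3DBlob* vsBlob = NULL;
D3DCompile(gShaderSource, strlen(gShaderSource), NULL, NULL, NULL,
  "VSMain", "vs_4_0", 0, 0, &vsBlob, NULL);   // (in real life, pass an error blob and check it)
ID3D11VertexShader* vertexShader = NULL;
device->CreateVertexShader(vsBlob->GetBufferPointer(), vsBlob->GetBufferSize(), NULL, &vertexShader);

ID3DBlob* psBlob = NULL;
D3DCompile(gShaderSource, strlen(gShaderSource), NULL, NULL, NULL,
  "PSMain", "ps_4_0", 0, 0, &psBlob, NULL);
ID3D11PixelShader* pixelShader = NULL;
device->CreatePixelShader(psBlob->GetBufferPointer(), psBlob->GetBufferSize(), NULL, &pixelShader);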
We also create an "input layout"
where we effectively tell the system the format (or "layout") of our vertex buffer and
match it to our vertex shader; here we're telling it that our vertex
only consists of a single position field, and that the position is three (X, Y, Z)
floating point numbers.
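In code, roughly:

D3D11_INPUT_ELEMENT_DESC layoutDesc[] = {
  // one element: three floats, semantic name POSITION, straight from the start of each vertex
  { "POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, 0, D3D11_INPUT_PER_VERTEX_DATA, 0 },
};

ID3D11InputLayout* inputLayout = NULL;
device->CreateInputLayout(layoutDesc, 1, vsBlob->GetBufferPointer(), vsBlob->GetBufferSize(), &inputLayout);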
Now finally we get to actually render some stuff -
the code looks like a lot of fluff, but it's
really not that bad once you get the hang of it:
First we set the viewport; this tells the API which part of the
screen to render on - we're gonna render to the whole screen.
Then we set the render target, i.e. which buffer to render to -
we only have a back buffer, so choose that, but this is where we
could also be rendering into a texture.
Next, we set our two shaders, the rasterizer state and our input layout.
We only have one vertex buffer, so we set that as well - it's
possible to have more than one, but let's not care about that for now.
We also set our index buffer - indices can be 16 or 32 bit, so we decide based on the data cgltf provided.
We also tell the API that our buffers are a list of triangles -
there are other kinds,
but they're not important most of the time.
And finally we perform the actual rendering - whee!
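Spelled out, the whole thing is something like this (a sketch; indexBuffer, indexCount and the 16- vs 32-bit index format are assumed to come out of the mesh loading step):

D3D11_VIEWPORT viewport = { 0.0f, 0.0f, 1280.0f, 720.0f, 0.0f, 1.0f };
context->RSSetViewports(1, &viewport);

context->OMSetRenderTargets(1, &renderTargetView, depthStencilView);

context->VSSetShader(vertexShader, NULL, 0);
context->PSSetShader(pixelShader, NULL, 0);
context->RSSetState(rasterizerState);
context->IASetInputLayout(inputLayout);

UINT stride = 3 * sizeof(float);   // one position per vertex, tightly packed
UINT offset = 0;
context->IASetVertexBuffers(0, 1, &vertexBuffer, &stride, &offset);
context->IASetIndexBuffer(indexBuffer, DXGI_FORMAT_R16_UINT, 0);   // or DXGI_FORMAT_R32_UINT, depending on the data

context->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);

context->DrawIndexed(indexCount, 0, 0);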
See? Not that bad after all!
(Worth mentioning, if it isn't obvious: you don't need to set all of this
for every single rendered object, you only need to set stuff that changes
in between Draw calls.)
And before we go, we clean up.
So now if you run the whole thing, you should get a white,
static, stretched mesh; not great, but we rendered something!
With the GPU! Yay!
Adding some animation
Next up we're going to add some animation, and fix the stretchiness.
For that, we're going to use some vector maths, and for that
we'll be using a math library called
ccVector.
First we're going to create a "constant buffer":
Think of this as shared memory between your program and the shader - you
write into it, the shader can read it.
For now, we're going to put two 4x4 matrices into it: a projection matrix
and a world transform (or object) matrix. Since we have two of them, our buffer
will be 16 * 2 = 32 floats.
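A sketch of creating it - 32 floats is 128 bytes, which conveniently satisfies the rule that constant buffer sizes must be a multiple of 16 bytes:

D3D11_BUFFER_DESC constantBufferDesc = {};
constantBufferDesc.ByteWidth = 2 * 16 * sizeof(float);   // two 4x4 matrices
constantBufferDesc.Usage = D3D11_USAGE_DEFAULT;
constantBufferDesc.BindFlags = D3D11_BIND_CONSTANT_BUFFER;

ID3D11Buffer* constantBuffer = NULL;
device->CreateBuffer(&constantBufferDesc, NULL, &constantBuffer);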
We then add the constant buffer declaration to the shader, and use it to transform our vertices;
first from object-space to world-space using the world matrix,
and then from world-space to screen-space using the projection matrix.
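The shader string grows into something like this (a sketch; depending on how your math library stores matrices you may need to swap the mul() argument order or transpose the matrices when uploading):

cbuffer Constants : register(b0)
{
  float4x4 projectionMatrix;
  float4x4 worldMatrix;
};

float4 VSMain( float3 position : POSITION ) : SV_POSITION
{
  float4 worldPosition = mul( worldMatrix, float4( position, 1.0 ) );   // object-space -> world-space
  return mul( projectionMatrix, worldPosition );                        // world-space -> screen-space
}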
In our rendering loop, we fill the first half of the buffer with a
standard perspective matrix: our field-of-view angle will be a quarter PI
(= 45 degrees), we calculate the aspect ratio from our window size, and
our near and far plane distances are just arbitrary values that work
for this scenario.
(Again, docs.)
For the second matrix, we just rotate it around the Y (vertical) axis,
and move it slightly forward so that it doesn't clip into our viewpoint
(which is at world zero).
(You may wonder if you constantly have to hardwire your data into specific
positions of the constant buffer: no, you don't have to, that's where
shader reflection
comes in.)
Let's also go back to our rasterizer state and explain why it's useful:
The problem is that our chosen math library provides matrices that are
right-handed
(i.e. Y increases up and Z increases backwards), but cgltf provides
meshes that are left-handed (i.e. Z increases forwards), and
this causes our triangles to face the wrong way and disappear
- so we use the rasterizer state to flip this setting to what we want.
Think this is annoying? Welcome to demo coding.
All that's left is to copy our (local) constant buffer into
the memory of the GPU, and then tell the vertex shader to use it.
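In code, assuming our CPU-side copy lives in a 32-float array called constants:

context->UpdateSubresource(constantBuffer, 0, NULL, constants, 0, 0);   // copy to GPU memory
context->VSSetConstantBuffers(0, 1, &constantBuffer);                   // and hand it to the vertex shader (b0)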
...And to clean up.
So now if you run our executable, our mesh is no longer stretched,
and rotates - probably. Kinda hard to tell. So all that's left is
to texture it.
Adding a texture
Like with everything so far, we're not going to write our own image
decompressor, instead we're just going to use
the legendary STB libraries.
(Once again, providing a texture is an exercise left to the reader.)
To load the texture to memory, simply give it the filename, and a few
variables that we'll be using later.
Note that we're freeing the memory early, before our mainloop
- this is because once we've moved the texture data from our memory to
graphics memory, our copy isn't needed anymore.
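A sketch, with a placeholder filename; we force four channels so the pixel data matches the RGBA texture format we'll create in a moment:

#define STB_IMAGE_IMPLEMENTATION
#include "stb_image.h"

int textureWidth = 0, textureHeight = 0, textureChannels = 0;
unsigned char* textureData = stbi_load("texture.png", &textureWidth, &textureHeight, &textureChannels, 4);

// ...create the D3D11 texture here (next step)...

stbi_image_free(textureData);   // the GPU has its own copy by now, ours can go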
Next up we create a texture
using the pixel data we have just received from the STB library,
and we create a shader resource view for it - you'll remember the views
from earlier, and we use this here so that we can use this texture in
our shader.
The only thing that may need explanation is the SysMemPitch
field, but the docs
should make its purpose obvious.
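Roughly:

D3D11_TEXTURE2D_DESC textureDesc = {};
textureDesc.Width = textureWidth;
textureDesc.Height = textureHeight;
textureDesc.MipLevels = 1;
textureDesc.ArraySize = 1;
textureDesc.Format = DXGI_FORMAT_R8G8B8A8_UNORM;
textureDesc.SampleDesc.Count = 1;
textureDesc.Usage = D3D11_USAGE_DEFAULT;
textureDesc.BindFlags = D3D11_BIND_SHADER_RESOURCE;

D3D11_SUBRESOURCE_DATA textureInitData = {};
textureInitData.pSysMem = textureData;
textureInitData.SysMemPitch = textureWidth * 4;   // bytes per row of pixels: width * 4 channels

ID3D11Texture2D* texture = NULL;
device->CreateTexture2D(&textureDesc, &textureInitData, &texture);

ID3D11ShaderResourceView* textureView = NULL;
device->CreateShaderResourceView(texture, NULL, &textureView);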
In our shader, first we generate a texture coordinate in our VS - these aren't
real, artist-assigned texture coordinates, just an ugly projection, but they'll do fine
for proving that stuff works - and for each rendered pixel in our PS, we sample
our new texture and return it as our final rendered color.
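The shader ends up looking something like this (a sketch; the lazy position-based UVs are exactly the "ugly projection" mentioned, and since we never create a sampler state on the CPU side, the sampler below just gets default behavior - creating and binding one explicitly via CreateSamplerState / PSSetSamplers is the more proper option):

// (the cbuffer declaration from the previous step stays as it was)

Texture2D ourTexture : register(t0);
SamplerState ourSampler;

struct VSOut
{
  float4 position : SV_POSITION;
  float2 uv : TEXCOORD0;
};

VSOut VSMain( float3 position : POSITION )
{
  VSOut output;
  float4 worldPosition = mul( worldMatrix, float4( position, 1.0 ) );
  output.position = mul( projectionMatrix, worldPosition );
  output.uv = position.xy;   // not real UVs, just a lazy planar projection
  return output;
}

float4 PSMain( VSOut input ) : SV_TARGET
{
  return ourTexture.Sample( ourSampler, input.uv );
}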
In our rendering code, the only change is that we assign the texture
as our pixel shader resource (since it's the pixel shader using it).
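Which is a one-liner:

context->PSSetShaderResources(0, 1, &textureView);   // slot 0 matches t0 in the shader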
And we deallocate.
And that's it.
You have a model you loaded from a file that animates, it has a texture
on it, you have music that plays, and the whole thing even quits on
Escape - what more would you want?
...In fairness, probably a whole lot more - fullscreen, vsync, cameras,
mipmaps, real texture coordinates and normals, multiple objects, keyframed
animation, lighting, postprocessing, and so on - but that's where the
real fun begins: you get to pick and choose which features to write
for your future demo, and slowly, incrementally build your
own engine / framework / toolset.
I'm only giving you the runway here. The whole "flying" part you have to
do yourself.