Technical overview

Project goals.

The Rational Autonomous Cybernetic Commandos (RACC) project aims to implement a fully rational artificial intelligence in cybernetic non-player characters (NPCs, or "bots") that behave as autonomous entities whose main characteristics are real-time environmental analysis and commando squad intelligence.

About non-cognitive abilities of the AI.

The instinctive side of an intelligence is the sum of the senses, means and actions a being is capable of without any of them being driven by reason. Instinct represents the automatic, mechanical dimension of an individual's capabilities. For biological systems it is a nervous phenomenon whose behavior is subject to rules established at the genetic level. For cybernetic systems it will be a read-only program whose behavior is subject to rules established by the programmer so as to simulate the known instinctive behavior of biological systems, for instance Man.

     Program is genome for artificial intelligence.
     Genome is program for natural intelligence.
     Both are Turing machines.

The robot is thus an anthropomorphic living being (or, relative to its universe, "player-o-morphic"). The non-cognitive abilities of this AI fall into three domains: sensing, action and navigation.

Sensing and sensitivity of a cybernetic individual.

The virtual world of first-person shooter games provides three sensing vectors: in order of vital importance, sight, hearing and touch. These must be implemented as realistically as possible in the instinctive program of the AI. The cybernetic individual's capacity for adaptation and learning, that is to say its capacity for survival, depends on their quality.

     Sight is implemented through spatial consideration. A comparison with the way a human eye works reveals an identical operating principle: the ray caster, or line tracer, which for each angle considered in the field of view measures the distance between the individual and the nearest obstacle in that direction. The cybernetic being thus acquires a spatial perception of its universe comparable to that of a human being. Human vision, quantified over time, allows 15 discernible captures per second. The bot should therefore not be able to sample its field of view more than 15 times per second.
     This ray caster successively fires TraceLines (test lines; cf. the technical documentation of id Software's 3D engines) from the location of the character, at an angle that is incremented at each iteration. The distances obtained this way fill an array whose median index matches the direction the bot is facing.
     The entity list is provided by the engine on request, in the form of an indexed list. The location of each entity is tested to determine whether the entity is potentially visible, in which case an additional TraceLine is fired to test the LOS (line of sight, i.e. whether the entity is actually visible or not). If the result is positive, the entity is kept in the set of visible entities and placed back in its context, like the player in the background in the figure below. A request is then made, either directly through the entity's own pointer or by calling a specialized engine function, to acquire its visual parameters (illumination, size, model, etc.).
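The sweep described above can be sketched as follows. This is only a minimal illustration, not the RACC implementation: TraceFn is a hypothetical stand-in for the engine's TraceLine, and ScanFieldOfView and SightClock are names invented for this example.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for the engine's TraceLine: given a yaw angle in
// degrees, returns the distance to the nearest obstacle in that direction.
using TraceFn = std::function<float(float yawDegrees)>;

// Sweeps the field of view from one edge to the other and stores one
// distance per angle. With an odd sample count, the median index of the
// array corresponds to the direction the bot is facing.
std::vector<float> ScanFieldOfView(const TraceFn& trace, float facingYaw,
                                   float fovDegrees, int sampleCount)
{
    assert(sampleCount >= 3 && sampleCount % 2 == 1);
    std::vector<float> distances(sampleCount);
    const float step = fovDegrees / (sampleCount - 1);
    for (int i = 0; i < sampleCount; ++i)
        distances[i] = trace(facingYaw - fovDegrees / 2 + i * step);
    return distances;
}

// Limits sampling to 15 scans per second of virtual time.
struct SightClock {
    double nextScanTime = 0.0;

    bool ShouldScan(double now)
    {
        if (now < nextScanTime) return false;
        nextScanTime = now + 1.0 / 15.0;
        return true;
    }
};
```

In a real bot the callback would wrap the engine's trace call and convert its result into a distance; the array layout is the only part taken directly from the text.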
     This as a whole constitutes a rational model of the human eye.
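The two-stage entity test (potential visibility first, then line of sight) might look like the sketch below. It works in two dimensions for brevity; Entity, InViewCone, VisibleEntities and the LineOfSightFn callback are all hypothetical names, the callback standing in for an engine TraceLine.

```cpp
#include <cassert>
#include <cmath>
#include <functional>
#include <vector>

struct Vec2 { float x, y; };
struct Entity { int index; Vec2 origin; };

// Hypothetical line-of-sight test standing in for an engine TraceLine:
// returns true when nothing blocks the segment between the two points.
using LineOfSightFn = std::function<bool(const Vec2& from, const Vec2& to)>;

// Cheap first test: is the target inside the horizontal view cone?
bool InViewCone(const Vec2& eye, float facingYaw, float fovDegrees,
                const Vec2& target)
{
    assert(fovDegrees > 0.0f && fovDegrees <= 360.0f);
    const float radToDeg = 180.0f / 3.14159265f;
    float yawToTarget =
        std::atan2(target.y - eye.y, target.x - eye.x) * radToDeg;
    // remainder() wraps the angular difference into [-180, 180].
    float delta = std::remainder(yawToTarget - facingYaw, 360.0f);
    return std::fabs(delta) <= fovDegrees / 2;
}

// Keeps only the entities that pass both the cone test and the LOS test.
std::vector<int> VisibleEntities(const std::vector<Entity>& entities,
                                 const Vec2& eye, float facingYaw,
                                 float fovDegrees,
                                 const LineOfSightFn& hasLineOfSight)
{
    std::vector<int> visible;
    for (const Entity& e : entities)
        if (InViewCone(eye, facingYaw, fovDegrees, e.origin) &&
            hasLineOfSight(eye, e.origin))
            visible.push_back(e.index);
    return visible;
}
```

The cone test runs first because it is cheap, so that the more expensive trace is only fired for entities that could be seen at all.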

Compared sight: human model

Compared sight: AI model

     Hearing is implemented through both time-based and spatial consideration. Ideally, the software would compute, for each date t of the virtual time, the noise intensity potentially heard by the cybernetic character by summing the values of the sound samples being played at that date. The sound wave obtained this way represents the ambient noise, in which disturbing or surprising sound events can then be identified by a simple derivation of the graph. The game engine technology does not make this noise intensity available at every instant, however; it only flags, at the relevant dates, the start of processing of a sound sample.
     The human model can therefore only be reproduced approximately. The selected solution consists in computing beforehand the duration and average loudness of each sample, and in adjusting the auditory floor accordingly in real time. The result is a time-based graph whose high values mean a great loudness, where the hearing system is meant to reach a ceiling (and then ignore additional sounds that blend too well into the ambient noise), and whose peaks in the derivative mean surprise. The effect of surprise ranges from a startle to a turn towards the direction of the sound, with an intensity proportional to the surprise coefficient itself.
     This is a rational model of the human ear.
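A minimal sketch of this approximation follows, assuming that each sample's duration and average loudness have been precomputed and that the engine only reports when a sample starts. The names SampleInfo and Ear are hypothetical, and the difference between two consecutive updates is used as a simple stand-in for the derivative.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Precomputed once per sound sample (hypothetical structure).
struct SampleInfo { float duration; float loudness; };

class Ear {
public:
    explicit Ear(float ceiling) : ceiling_(ceiling) {}

    // Called when the engine flags the start of a sound sample.
    void OnSoundStart(const SampleInfo& sample, float now)
    {
        assert(sample.duration > 0.0f);
        playing_.push_back({now + sample.duration, sample.loudness});
    }

    // Recomputes the ambient level and returns the surprise, i.e. the rise
    // of the level since the previous update (0 when the level falls).
    float Update(float now)
    {
        playing_.erase(
            std::remove_if(playing_.begin(), playing_.end(),
                           [now](const Playing& p) { return p.endTime <= now; }),
            playing_.end());
        float level = 0.0f;
        for (const Playing& p : playing_) level += p.loudness;
        level = std::min(level, ceiling_);  // the ear saturates at a ceiling
        float surprise = std::max(0.0f, level - level_);
        level_ = level;
        return surprise;
    }

    float Level() const { return level_; }

private:
    struct Playing { float endTime; float loudness; };
    std::vector<Playing> playing_;
    float ceiling_;
    float level_ = 0.0f;
};
```

Because the level is clamped at the ceiling, a loud sound starting in an already noisy environment yields a smaller surprise than the same sound in silence, which matches the masking effect described above.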

Compared hearing: human model

Compared hearing: AI model

Touch is, in this type of virtual world, a sensation acquired by proxy. Since the human-machine interface consists only of a screen and loudspeakers, the sense of touch can only be transmitted through these two vectors.
For the player to perceive it as such, the game engine first simulates the sense of touch through sound: footstep sounds, landing sounds, hit sounds. These sound samples are therefore referenced and receive special processing in the hearing part of the AI. The result of this processing has to be amplified when the sense of touch is also felt through a visual vector: screen hop (flinch), wound (damage), blood. In that case it is relevant to adjust a panic factor accordingly, whose intensity is a direct function of the combined intensity of flinch and damage, and whose variation over time is a function of the flinch duration. Once it rises to a certain threshold, the panic factor (which can safely be conflated with self-confidence) triggers a survival reaction (flight) by taking control, until a normal cognitive thinking cycle is restored, of the last non-cognitive ability of the AI: navigation.
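One possible reading of the panic factor is sketched below. The text does not give the exact formulas, so everything here is an assumption made for illustration: the additive combination of flinch and damage, the linear decay over the flinch duration, and the names PanicState, OnHit, Update and ShouldFlee.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical panic accumulator: it rises with combined flinch and damage
// and decays linearly so that it is back to zero after the flinch duration.
struct PanicState {
    float level = 0.0f;
    float decayPerSecond = 0.0f;

    // Assumption: flinch and damage are normalized intensities that add up.
    void OnHit(float flinch, float damage, float flinchDuration)
    {
        assert(flinchDuration > 0.0f);
        level += flinch + damage;
        decayPerSecond = level / flinchDuration;
    }

    // Called once per frame with the frame duration in seconds.
    void Update(float frameTime)
    {
        level = std::max(0.0f, level - decayPerSecond * frameTime);
    }

    // When true, navigation is handed over to the flight reaction.
    bool ShouldFlee(float threshold) const { return level >= threshold; }
};
```

For instance, a hit with flinch 0.25 and damage 0.5 raises the level to 0.75, which exceeds a threshold of 0.5; one second into a two-second flinch the level has decayed to 0.375 and normal behavior resumes.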

About the means of action of the cybernetic individual.

The standard human-machine interface between man and his controlled avatar in the virtual world (the player) is commonly made up of a mouse and a keyboard.

We can model the movement of the mouse under the action of a human hand, from a point on screen with coordinates (x_0, y_0) to a point with coordinates (x_∞, y_∞), as a function of time t measured in display frames (t ∈ ℕ), using the following system of logarithmic functions (thanks to: Tobias "Killaruna" Heimann, Johannes "@$3.1415rin" Lampel):

dy/dt = a·((dy/dt)_{t-1} × e^{a·log(s/2)} + s·(y_∞ - y_t) × (1 - e^{a·log(s/2)}))
dx/dt = a·((dx/dt)_{t-1} × e^{a·log(s/2)} + s·(x_∞ - x_t) × (1 - e^{a·log(s/2)}))

where (dx/dt)_{t-1} and (dy/dt)_{t-1} are the speeds computed at the previous frame,
where a equals 20 times the duration in milliseconds of a frame,
where s is the reaction speed of the individual (empirical constant ranging from 0 to 1).

The reaction speed of the individual varies according to the level of experience and self-confidence.

It is moreover necessary to introduce a noise inherent in the coordination of the x and y axes:

dx/dt + c·(dy/dt)·((dx/dt)/|dx/dt|)
dy/dt + c·(dx/dt)·((dy/dt)/|dy/dt|)

where c is the accuracy coefficient assigned to the cybernetic individual.

The orientation of the cybernetic individual is thus described by two vector axes and one pair of integers: the current axis, the ideal axis, and the movement speed of the cursor.
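The speed update and the coordination noise given above can be written for one frame as in the sketch below. It is not the RACC source: the function names TurnSpeed and AddCoordinationNoise are hypothetical, a is passed in directly so that no assumption is made about its units, and the sign term is taken as 0 when the speed is zero, a case the formulas leave undefined.

```cpp
#include <cassert>
#include <cmath>

// One frame of the speed update for a single axis.
//   previousSpeed : speed computed at the previous frame
//   deviation     : remaining distance to the ideal position (x_inf - x_t)
//   a             : frame-duration factor defined in the text
//   s             : reaction speed of the individual, in (0, 1]
float TurnSpeed(float previousSpeed, float deviation, float a, float s)
{
    assert(s > 0.0f && s <= 1.0f);
    float blend = std::exp(a * std::log(s / 2));
    return a * (previousSpeed * blend + s * deviation * (1 - blend));
}

// Sign of v; returns 0 for v == 0, where v/|v| would be undefined.
float Sign(float v)
{
    return static_cast<float>((v > 0.0f) - (v < 0.0f));
}

// Adds to each axis a share of the other axis' speed, scaled by the
// accuracy coefficient c. Both results use the speeds before the noise.
void AddCoordinationNoise(float& dx, float& dy, float c)
{
    float noisyDx = dx + c * dy * Sign(dx);
    float noisyDy = dy + c * dx * Sign(dy);
    dx = noisyDx;
    dy = noisyDy;
}
```

Each frame, the bot would compute TurnSpeed for both axes from the deviation between the current and ideal axes, apply the noise, and add the resulting speeds to the current axis.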

In the same way, we can model the action on any key of the keyboard by considering the two parameters that suffice to describe a minimal event of this type: the press date and the release date of the key. A test is made each frame to determine whether any of the allowed keyboard input channels should be activated or suspended. The keyboard/mouse human-machine interface implemented this way gives the AI means of action comparable to those a human player has to control his software avatar.
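The press date / release date model can be illustrated with a small sketch. The names KeyEvent and VirtualKeyboard, and the integer key codes, are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <vector>

// A minimal key event: which key, when it goes down, when it comes up.
struct KeyEvent { int key; float pressTime; float releaseTime; };

class VirtualKeyboard {
public:
    // Plans a key press between the two dates.
    void Schedule(int key, float pressTime, float releaseTime)
    {
        assert(releaseTime >= pressTime);
        events_.push_back({key, pressTime, releaseTime});
    }

    // The per-frame test: should this input channel be active right now?
    bool IsDown(int key, float now) const
    {
        for (const KeyEvent& e : events_)
            if (e.key == key && now >= e.pressTime && now < e.releaseTime)
                return true;
        return false;
    }

private:
    std::vector<KeyEvent> events_;
};
```

Each frame, the bot would query IsDown for every allowed input channel and translate the active ones into the movement and button fields it sends to the engine.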

Tactical waypointless navigation.

     For any biological system that has such freedom, moving around is a vital instinctive need. The same will be true of the cybernetic individual.
     Ideally, the navigational ability divides into two poles, instinctive and cognitive. The first expresses the need to explore the environment; the second expresses the conscious means that influence its performance: caution and tactics. "Tactics", as opposed to "strategy", here denotes the sum of the means consciously considered the best ones for achieving a goal whose validity in time does not exceed a near future of a few seconds.
     Such a model had to be simplified for technical reasons, for it requires more computing power than the hardware can provide. The cybernetic individual has therefore been given the instinctive capability of tactical movement, which in the human being is traditionally located on the cognitive plane.

Any human being in normal physiological condition, in an environment a priori free from danger and in the absence of any cognitive focus, has a natural tendency to direct his gaze in the direction of the longest distance his field of view can cover. This is a postulate of which I am the author and for which I alone am responsible, but which I believe relevant enough to be accepted as a basic axiom. The RACC preview already showed that waypointless navigation based on this axiom, and on it alone, largely suffices for navigation in virtual worlds of relatively simple geometry. The view focus of the cybernetic character during movement therefore adapts to the variations of its field of view.
Particular attention must nevertheless be given to lateral intersections, for they potentially represent a greater coverable distance than the one currently in focus. Using a few TraceLines, the character can anticipate the uncovered passage and temporarily direct its view focus towards the intersection, the whole process taking, at best, a little under a second.
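Applied to the array of distances produced by the ray caster, the axiom amounts to looking for the largest entry. The sketch below assumes the array covers the field of view evenly from one edge to the other; the function name LongestViewYaw is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Returns the yaw (in degrees) of the longest distance found in the field
// of view array, whose first and last entries are the two edges of the FOV.
float LongestViewYaw(const std::vector<float>& distances, float facingYaw,
                     float fovDegrees)
{
    assert(distances.size() >= 2);
    auto longest = std::max_element(distances.begin(), distances.end());
    auto index = std::distance(distances.begin(), longest);
    float step = fovDegrees / (distances.size() - 1);
    return facingYaw - fovDegrees / 2 + index * step;
}
```

The returned yaw would become the ideal axis towards which the view focus turns; lateral anticipation would add a few extra traces on the sides before committing to it.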

Navigation: view focus direction

Navigation: lateral anticipation

A rational use of the data collected by the eye of the cybernetic character enables a tactical approach to movement. In an environment a priori dangerous, tactical navigation consists in moving from cover point to cover point while avoiding, as much as possible, any direct exposure to the presumed source of danger.

- Any peak in the graph representing the field of view represents a convex obstacle angle. If a TraceLine fired from the corresponding location to the source of the threat returns a negative result, this point also represents a covered zone.
- Any peak in the derivative of the graph representing the field of view represents a concave angle. Any index of this type represents a potential source of danger, for it indicates a lateral intersection. These angles must be kept under particular surveillance.

The study of the instinctive movement reactions of an individual involved in urban warfare shows moreover that, as long as the threat has not explicitly revealed its source, that source is commonly assimilated to any nearby concave angle (i.e. lateral intersection), provided that the angle between the vector directed towards the objective and the vector directed towards the suspicious intersection does not exceed 90° in absolute value. This guarantees that the cybernetic character never turns its back on its objective.
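The analysis of the field of view graph and the 90° rule can be sketched as follows. This is an illustrative reading only: peaks are taken as strict local maxima of the distance array, peaks of the derivative as jumps between neighboring entries larger than a threshold chosen by the caller, and the names FindPeaks, FindJumps and IsSuspectIntersection are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Indices where the distance is a strict local maximum.
std::vector<int> FindPeaks(const std::vector<float>& d)
{
    std::vector<int> peaks;
    for (std::size_t i = 1; i + 1 < d.size(); ++i)
        if (d[i] > d[i - 1] && d[i] > d[i + 1])
            peaks.push_back(static_cast<int>(i));
    return peaks;
}

// Indices i where the discrete derivative |d[i+1] - d[i]| exceeds the
// threshold, i.e. where the distance jumps as at a lateral intersection.
std::vector<int> FindJumps(const std::vector<float>& d, float threshold)
{
    assert(threshold > 0.0f);
    std::vector<int> jumps;
    for (std::size_t i = 0; i + 1 < d.size(); ++i)
        if (std::fabs(d[i + 1] - d[i]) > threshold)
            jumps.push_back(static_cast<int>(i));
    return jumps;
}

struct Vec2 { float x, y; };

// The angle between (bot -> objective) and (bot -> intersection) does not
// exceed 90 degrees exactly when their dot product is non-negative.
bool IsSuspectIntersection(const Vec2& bot, const Vec2& objective,
                           const Vec2& intersection)
{
    float dot = (objective.x - bot.x) * (intersection.x - bot.x) +
                (objective.y - bot.y) * (intersection.y - bot.y);
    return dot >= 0.0f;
}
```

Using the dot product avoids computing the angle itself, which keeps the per-frame cost low when many intersections are in view.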

Tactical navigation in action

Corresponding FOV graph

Pathfinding always requires a symbolic division of the virtual universe, either into waypoints or into walkable space zones (navmesh). Contrary to the direction traditionally chosen when programming a moving entity, the second solution is retained here, for it accurately provides the spatial extent of the walkable zone, something waypoints do not allow, or only very imprecisely.
The list of walkable faces is obtained by interpreting the BSP file, in which all the polygons that make up a map of the virtual world are spatially described. The BSP format (Binary Space Partitioning) stores a descending tree describing the successive divisions to be made on a single volume in order to reproduce all the geometry of the universe in non-concave polygons. The file is interpreted using botman's work, notably his excellent BSP Tools. Note that this operation is carried out only once per map, in order to build the corresponding data file.

     Each polygon that is horizontal, or whose slope does not exceed the maximal walkable value (45°), is listed as potentially walkable. The reachability between polygons is built in real time through experience and monitoring. Each cybernetic individual thus acquires the memory of its own moves, together with those of the other individuals it sees. A correctly filled bitmask makes it possible to distinguish each type of reachability from one walkable polygon to another (ladder, jump, liquid, need to crouch, etc.).
     The connection system described above constitutes the navmesh (pseudo-cognitive map) of the AI, which can be searched by an algorithm such as A* in order to determine a path according to a strategic choice depending on the situation (the shortest, the safest, etc.).
     NOTE: It is absolutely pointless to store danger, frequentation, or any other information that is not strictly objective in the structure of these walkable spaces. Such subjective information is stored in the individual's cognitive memory, which must therefore be queried during the pathfinding process.
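The slope test and the reachability bitmask might be sketched as below, assuming that each face comes with a normal vector and that Z is the vertical axis. The names IsWalkable, Reachability and Connection, as well as the particular flag values, are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };

// A face is potentially walkable when the angle between its normal and the
// vertical does not exceed the maximal slope. Faces whose normal points
// downwards (ceilings) are rejected because their cosine is negative.
bool IsWalkable(const Vec3& normal, float maxSlopeDegrees = 45.0f)
{
    assert(maxSlopeDegrees >= 0.0f && maxSlopeDegrees < 90.0f);
    float length = std::sqrt(normal.x * normal.x + normal.y * normal.y +
                             normal.z * normal.z);
    if (length == 0.0f) return false;  // degenerate face
    float cosSlope = normal.z / length;
    float cosMax = std::cos(maxSlopeDegrees * 3.14159265f / 180.0f);
    return cosSlope >= cosMax;
}

// One bit per type of reachability between two walkable faces.
enum Reachability : std::uint32_t {
    REACH_LADDER = 1u << 0,
    REACH_JUMP   = 1u << 1,
    REACH_LIQUID = 1u << 2,
    REACH_CROUCH = 1u << 3
};

// A learned connection from one walkable face to another.
struct Connection {
    int fromFace;
    int toFace;
    std::uint32_t reachability;
};
```

A connection that requires both a jump and a crouch would simply carry REACH_JUMP | REACH_CROUCH, and a bot could test any single requirement with a bitwise AND.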
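A compact A* over such a navmesh could look like the sketch below. To respect the separation stated in the note, the mesh only stores objective data (here a link length), while the cost of a link is supplied by a callback that may query the individual's memory. All names (Link, NavMesh, CostFn, HeuristicFn, FindPath) are hypothetical, and the heuristic is assumed never to overestimate the remaining cost.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <limits>
#include <queue>
#include <vector>

struct Link { int to; float length; };            // objective data only
using NavMesh = std::vector<std::vector<Link>>;   // links per walkable face
using CostFn = std::function<float(int from, const Link& link)>;
using HeuristicFn = std::function<float(int face, int goal)>;

struct OpenEntry { float f; float g; int face; };
struct ByLowestF {
    bool operator()(const OpenEntry& a, const OpenEntry& b) const
    {
        return a.f > b.f;
    }
};

// Returns the faces from start to goal, or an empty path if unreachable.
std::vector<int> FindPath(const NavMesh& mesh, int start, int goal,
                          const CostFn& cost, const HeuristicFn& estimate)
{
    const int faceCount = static_cast<int>(mesh.size());
    assert(start >= 0 && start < faceCount && goal >= 0 && goal < faceCount);
    const float inf = std::numeric_limits<float>::infinity();
    std::vector<float> g(mesh.size(), inf);
    std::vector<int> parent(mesh.size(), -1);
    std::priority_queue<OpenEntry, std::vector<OpenEntry>, ByLowestF> open;

    g[start] = 0.0f;
    open.push({estimate(start, goal), 0.0f, start});
    while (!open.empty()) {
        OpenEntry current = open.top();
        open.pop();
        if (current.face == goal) break;
        if (current.g > g[current.face]) continue;  // outdated entry
        for (const Link& link : mesh[current.face]) {
            float candidate = current.g + cost(current.face, link);
            if (candidate < g[link.to]) {
                g[link.to] = candidate;
                parent[link.to] = current.face;
                open.push({candidate + estimate(link.to, goal), candidate,
                           link.to});
            }
        }
    }

    std::vector<int> path;
    if (g[goal] == inf) return path;
    for (int face = goal; face != -1; face = parent[face])
        path.push_back(face);
    std::reverse(path.begin(), path.end());
    return path;
}
```

Switching from the shortest to the safest path then only means passing a different cost callback, for example one that adds a penalty to faces the bot remembers as dangerous; the mesh itself is left untouched.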

Pathfinding: A*

About the cognitive abilities of AI.

[ this part has yet to be written ]

If you are interested in more detailed explanations of the source code, more technical information is available in the original readme.txt that botman provided with his HPB template. It is a complete tutorial for hacking the HPB_bot.

botman is also the author of a C++ tutorial that explains quite well the basics of classes and inheritance in C++. Even if knowledge of C is sufficient to hack the code, programmers must remember that the interface with the Half-Life engine is written in C++.

The Rational Autonomous Cybernetic Commandos source code is free for all to look at and use. A strong emphasis has been put on correctness, cleanliness, readability and commenting.

Use the source, Luke!

Avtomat Kalashnikov AK-47
