Schematic of the overall framework. Given a task (e.g., an analogy to solve), inputs (denoted as {A, B, C, D}) are represented by grid codes, consisting of units (“grid cells”) representing different combinations of frequencies and phases. Grid embeddings (xA, xB, xC, xD) are multiplied elementwise by a set of learned attention weights w, then passed to a inference module R. The attention weights w are optimized using LDPP, which encourages attention to grid embeddings that maximize the volume of the representational space. The inference module outputs a score for each candidate analogy (consisting of A, B, C and a candidate answer choice D). The scores for all answer choices are passed through a softmax to generate an answer ŷ, which is compared against the target y to generate the task loss Ltask.