To accelerate the transition to memory-safe programming languages, the US Defense Advanced Research Projects Agency (DARPA) is driving the development of TRACTOR, a programmatic code conversion vehicle.

The term stands for TRanslating All C TO Rust. It’s a DARPA project that aims to develop machine-learning tools that can automate the conversion of legacy C code into Rust.

The reason to do so is memory safety. Memory safety bugs, such as buffer overflows, account for the majority of major vulnerabilities in large codebases. And DARPA’s hope is that AI models can help with the programming language translation, in order to make software more secure.

“You can go to any of the LLM websites, start chatting with one of the AI chatbots, and all you need to say is ‘here’s some C code, please translate it to safe idiomatic Rust code,’ cut, paste, and something comes out, and it’s often very good, but not always,” said Dan Wallach, DARPA program manager for TRACTOR, in a statement.

  • MajorHavoc@programming.dev
    2 months ago

    “You can go to any of the LLM websites, start chatting with one of the AI chatbots, and all you need to say is ‘here’s some C code, please translate it to safe idiomatic Rust code,’ cut, paste, and something comes out, and it’s often very good, but not always,” said Dan Wallach, DARPA program manager for TRACTOR, in a statement.

    “This parlor trick impressed me. I’m sure it can scale to solve difficult real world problems.”

    It’s a promising approach worth trying, but I won’t be holding my breath.

    If DARPA really wanted safer languages, they could be pushing test coverage, not blindly converting stable, well-tested C code into untested Rust code.

    This, like most AI speculation, reeks of looking for shortcuts instead of doing the boring job at hand.

    • thingsiplay@beehaw.org
      2 months ago

      Also:

      As to the possibility of automatic code conversion, Morales said, “It’s definitely a DARPA-hard problem.” The number of edge cases that come up when trying to formulate rules for converting statements in different languages is daunting, he said.

    • ByteOnBikes@slrpnk.net
      2 months ago

      I’m thinking they also want to future proof this.

      The number of C devs is shrinking. It’s a really difficult language to get competent with.

      • FizzyOrange@programming.dev
        1 month ago

        Ada is not strictly safer. It’s not memory safe for example, unless you never free. The advantage it has is mature support for formal verification. But there’s literally no way you’re going to be able to automatically convert C to Ada + formal properties.

        In any case Rust has about a gazillion in-progress attempts at adding various kinds of formal verification support: Kani, Prusti, Creusot, Verus, etc. It probably won’t be long before it’s better than Ada.

        Also if your code is Ada then you only have access to the tiny Ada ecosystem, which is probably fine in some domains (e.g. embedded) but not in general.

    • sudo42@lemmy.world
      1 month ago

      A: “We really need this super-important and highly-technical job done.”
      B: “We could just hire a bunch of highly-technical people to do it.”
      A: “No, we would have to hire people and that would cost us millions.”
      B: “We could spend billions on untested technology and hope for the best.”
      A: “Excellent work B! Charge the government $100M for our excellent idea.”

  • Mischala@lemmy.nz
    2 months ago

    turning C code automatically into Rust…

    Oh wow they must have some sick transpiler, super exciting…

    With AI, of course

    God fucking damnit.

  • Vivendi@lemmy.zip
    2 months ago

    Code works in C

    Want to make it safer

    Put it into a fucking LLM

    You know sometimes I wonder if I’m an idiot or that maybe I just don’t have the right family connections to get a super high paying job

    • douglasg14b@programming.dev
      1 month ago

      Too bad commenters are as bad at reading articles as LLMs are at handling complex scenarios, and just as confident in their comments.

      This is a pretty level-headed, calculated approach DARPA is taking (as expected from DARPA).

  • antihumanitarian@lemmy.world
    1 month ago

    Key detail in the actual memo is that they’re not using just an LLM. “Wallach anticipates proposals that include novel combinations of software analysis, such as static and dynamic analysis, and large language models.”

    They also are clearly aware of scope limitations. They explicitly call out some software, like entire kernels or pointer arithmetic heavy code, as being out of scope. They also seem to not anticipate 100% automation.

    So with context, they seem open to any solutions to “how can we convert legacy C to Rust.” Obviously LLMs and machine learning are attractive avenues of investigation: current models are demonstrably able to write some valid Rust and transliterate some code. I use them, and they work more often than not for simpler tasks.

    TL;DR: they want to accelerate converting C to Rust. LLMs and machine learning are some techniques they’re investigating as components.

  • AlexWIWA@lemmy.ml
    2 months ago

    It’d be nice if they open source this like they did with Ghidra. The video game reverse engineering and modernization efforts have been much easier thanks to the government open sourcing their tools.

    • zaphod@sopuli.xyz
      2 months ago

      I threw some simple code at it and it even put unsafe on the main function. What’s the point of Rust then if everything is unsafe?

        • ulterno@lemmy.kde.social
          1 month ago

          And I hope that’s not someone who doesn’t understand the static keyword after 2+ years of C++ development.

      • The_Decryptor@aussie.zone
        2 months ago

        Ideally you don’t directly ship the code it outputs, you use it instead of re-writing it from scratch and then slowly clean it up.

        Like Mozilla used it for the initial port of qcms (the colour management library they wrote for Firefox), then slowly edited the code to be idiomatic Rust code. Compare that to something like librsvg that did a function-by-function port.

      • JackbyDev@programming.dev
        1 month ago

        Baby steps. It’s easier to convert code marked unsafe in Rust to not need unsafe than it is to convert arbitrary code in other languages to Rust code that doesn’t need unsafe.
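
        A rough sketch of what those baby steps can look like, assuming a small C function along the lines of `int sum(const int *xs, size_t n)`. The names `sum_raw` and `sum_safe` are hypothetical, and using wrapping arithmetic is an assumption made only to keep the two versions behaving identically on overflow; this is not the output of any particular tool.

```rust
/// Step one: roughly what a mechanical, line-by-line translation tends to
/// look like. Raw pointers survive, so the function has to be `unsafe`.
///
/// # Safety
/// `xs` must point to at least `n` readable, initialized `i32` values.
pub unsafe fn sum_raw(xs: *const i32, n: usize) -> i32 {
    let mut total: i32 = 0;
    let mut i: usize = 0;
    while i < n {
        // SAFETY: the caller promises that `xs[0..n]` is valid to read.
        total = total.wrapping_add(unsafe { *xs.add(i) });
        i += 1;
    }
    total
}

/// Step two: the same logic after a manual cleanup pass. The pointer and
/// length pair becomes a slice, so the bounds are carried by the type and
/// no `unsafe` is needed anywhere.
pub fn sum_safe(xs: &[i32]) -> i32 {
    xs.iter().fold(0i32, |acc, &x| acc.wrapping_add(x))
}
```

        Callers of the first version still need their own `unsafe` block, for example `unsafe { sum_raw(data.as_ptr(), data.len()) }`, while the second is called as plain `sum_safe(&data)`. The compiler can check the second form but only has to trust the first.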

  • litchralee@sh.itjust.works
    2 months ago

    This is an interesting application of so-called AI, where the result is actually desirable and isn’t some sort of frivolity or grift. The memory-safety guarantees offered by native Rust code would be a very welcome improvement over C code that guarantees very little. So a translation of legacy code into Rust would either attain memory safety, or wouldn’t compile. If AI somehow (very unlikely) manages to produce valid Rust that ends up being memory-unsafe, then it’s still an advancement as the compiler folks would have a new scenario to solve for.

    Lots of current uses of AI have focused on what the output could enable, but here, I think it’s worth appreciating that in this application, we don’t need the AI to always complete every translation. After all, some C code will be so hardware-specific that it becomes unwieldy to rewrite in Rust, without also doing a larger refactor. DARPA readily admits that their goal is simply to improve the translation accuracy, rather than achieve perfection. Ideally, this means the result of their research is an AI which knows its own limits and just declines to proceed.

    Assuming that the resulting Rust is: 1) native code, and 2) idiomatic, so humans can still understand and maintain it, this is a project worth pursuing. Meanwhile, I have no doubt grifters will also try to hitch their trailer on DARPA’s wagon, with insane suggestions that proprietary AI can somehow replace whole teams of Rust engineers, or some such nonsense.

    Edit: is my disdain for current commercial applications of AI too obvious? Is my desire for less commercialization and more research-based LLM development too subtle? :)

    • KeriKitty (They(/It))@pawb.social
      2 months ago

      so-called AI

      knows its own limits

      frustration noises It knows nothing! It’s not intelligent. It doesn’t understand anything. Attempts to keep those things acting within expected/desired lines fail constantly, and not always due to malice. This project’s concept reeks of laziness and trend-following. Instead of a futile effort to make a text generator reliably produce either an error or correct code, they should perhaps put that effort into writing a transpiler built on knowable, understandable rules. … Oh, and just hire a damn Rust dev. They’re climbing up the walls looking to Rust-ify everything, just let them do it.

    • Joey@programming.dev
      2 months ago

      So a translation of legacy code into Rust would either attain memory safety, or wouldn’t compile.

      They’d probably have to make sure it doesn’t use the unsafe keyword to guarantee this.
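
      One way to enforce that is the compiler's `unsafe_code` lint. Below is a minimal sketch, assuming the translated code lives in a module called `translated` (a made-up name, as is `byte_at`). With `forbid`, any `unsafe` block or `unsafe fn` inside the module becomes a compile error that cannot be switched back off further down.

```rust
// Reject any use of `unsafe` inside this module at compile time.
// For a whole crate, the equivalent is `#![forbid(unsafe_code)]` at the
// top of `main.rs` or `lib.rs`.
#[forbid(unsafe_code)]
mod translated {
    /// Hypothetical translated helper: a C-style `buf[idx]` read turned
    /// into a checked lookup that returns `None` when `idx` is out of range.
    pub fn byte_at(buf: &[u8], idx: usize) -> Option<u8> {
        buf.get(idx).copied()
    }
}
```

      The lint only covers the code it is applied to. Dependencies and the standard library can still contain `unsafe` internally, so this guarantees that the translated code itself adds none, not that the whole program is free of it.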

  • echindod@programming.dev
    2 months ago

    Using an LLM to come up with function names for transpiled code would be a good idea, but other than that, nope.